

After reading this chapter, you will be able to understand:

Language processing system

Lexical analysis
Syntax analysis

Context free grammars and ambiguity

Types of parsing

Top down parsing


Bottom up parsing 
Conflicts 

Operator precedence grammar 
LR parser 

Canonical LR parser(CLR) 


Language Processing System

Language Processors 

Interpreter 

It is a computer program that executes instructions written in a
programming language. It either executes the source code directly
or translates the source code into some efficient intermediate
representation and immediately executes this.


Source program, Input  ->  Interpreter  ->  Output

High level program (source program)  ->  Compiler  ->  Low level program (target program)
                                            |
                                            v
                                      Error messages


Passes 

One scan of the source code is called a pass. The number of passes is
the number of times the source code is scanned before the executable
code is obtained.

A compiler is typically two pass. A single pass compiler requires more
memory and a multipass compiler requires less memory.


Example: Early versions of Lisp programming language, BASIC. 

Translator 

A software system that converts source code from one form of
language to another form of language is called a translator.
There are 2 types of translators, namely (1) Compiler (2) Assembler.

A compiler converts source code of a high level language into a low
level language.

An assembler converts assembly language code into binary code.


Analysis-synthesis model of compilation 

There are two parts of compilation:

Compilation
    Analysis (front end)
    Synthesis (back end)


Compilers

A compiler is software that translates code written in a high-level
language (the source language) into a target language.

Example: source languages like C, Java etc. Compilers are user friendly.

The target language is like machine language, which is efficient
for hardware.

Analysis: It breaks up the source program into pieces and creates
an intermediate representation of the source program. This part is more
language specific.

Synthesis: It constructs the desired target program from the
intermediate code.

Front end and back end of a compiler

The front end includes the analysis phases and the intermediate code
generator, with part of code optimization. Error handling is also involved.

[Figure: source program -> ... -> intermediate code generator -> three address code]

Syntax analyzer or parser

• Tokens are grouped hierarchically into nested collections
  with collective meaning.

• A context free grammar (CFG) specifies the rules or
  productions for identifying constructs that are valid in
  a programming language. The output is a parse tree or
  derivation tree.

Example: Parse tree for -(id + id) using the grammar (G1).

Context of a compiler 

In addition to a compiler, several other programs may be
required to create an executable target program, like a
preprocessor to expand macros.

The target program created by a compiler may require 
further processing before it can be run. 

The language processing system will be like this: 

[Figure: the language processing system, starting from "Source program with macros" and ending at "Absolute machine code"]

Grammar (G1):

E -> E + E
E -> E * E
E -> -E
E -> (E)
E -> id

Phases 

The compilation process is partitioned into subprocesses
called phases.

In order to translate high level code to machine code,
we need to go phase by phase, with each phase doing a
particular task and passing its output on to the next phase.

Lexical analysis or scanning 

It is the first phase of a compiler. The lexical analyzer reads the
stream of characters making up the source program and groups
the characters into meaningful sequences called lexemes.
Example: Consider the statement: if (a < b)
In this statement the tokens are if, (, a, <, b, )

Number of tokens = 6
Identifiers: a, b
Keywords: if



Parse tree for -(id + id):

          E
         / \
        -   E
          / | \
         (  E  )
          / | \
         E  +  E
         |     |
         id    id


Semantic analysis 

• It checks the source program for semantic errors.

• Type checking is done in this phase, where the compiler
  checks that each operator has matching operands, for
  semantic consistency with the language definition.

• Gathers the type information for the next phases.

Example 1: The bicycle rides the boy.

This statement has no meaning, but it is syntactically
correct.

Example 2:

int a;
bool b;
char c;
c = a + b;

We cannot add an integer to a Boolean variable and assign the
result to a character variable.

Intermediate code generation 

The intermediate representation should have two important
properties:

(i) It should be easy to produce.

(ii) It should be easy to translate into the target program.

'Three address code' is one of the common forms of
intermediate code.

Three address code consists of a sequence of instructions,
each of which has at most three operands.

Example:

id2 + id3 * 10;
inttoreal (10)
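The idea of three address code can be sketched in a few lines of Python. This is only an illustration, not the book's algorithm: the nested-tuple expression format, the function name to_three_address and the temporary names t1, t2, ... are all assumptions made for the example.

```python
from itertools import count

def to_three_address(expr):
    """Flatten a nested (op, left, right) tuple into three address code.

    Leaves are plain strings such as "id2" or "10" (hypothetical format).
    Returns (list of instructions, name that holds the final value).
    """
    code = []
    temps = count(1)  # t1, t2, ... are compiler-generated temporaries

    def walk(node):
        if isinstance(node, str):
            return node  # a name or constant is already an address
        op, left, right = node
        left_addr, right_addr = walk(left), walk(right)
        temp = f"t{next(temps)}"
        # Each instruction mentions at most three addresses.
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    return code, walk(expr)

# id2 + id3 * 10, with * binding tighter than +
code, result = to_three_address(("+", "id2", ("*", "id3", "10")))
for line in code:
    print(line)
```

Each inner operation gets its own temporary, so a large expression becomes a flat list of small instructions that later phases can optimize and translate one at a time.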


Code optimization

This phase attempts to improve the intermediate code, so that faster
running machine code results.

Example: Two temporary registers are eliminated by the optimization.

Code generation

• In this phase, the target code is generated.

• Generally the target code can be either relocatable
  machine code or assembly code.

• Intermediate instructions are each translated into a
  sequence of machine instructions.

• Assignment of registers will also be done.

Example: MOVF id3, R2
         MULF #60.0, R2
         MOVF id2, R1
         ADDF R2, R1
         MOVF R1, id1

The lexical analyzer sends the stream of tokens to the parser
for syntax analysis.

There will be interaction with the symbol table as well.

[Figure: source program -> lexical analyzer -> parser]

Symbol table management 

A symbol table is a data structure containing a record for
each variable name, with fields for the attributes of the
name.

What is the use of a symbol table?

1. To record the identifiers used in the source program.

2. To record their type and scope.

3. If the identifier is a procedure name, to record the number of
   arguments, the types of the arguments, the method of passing
   (by reference) and the type returned.

Error detection and reporting

(i) The lexical phase can detect errors where the characters
remaining in the input 'do not form any token'.

(ii) Errors of the type 'violation of syntax' of the language
are detected by syntax analysis.

(iii) The semantic phase tries to detect constructs that have
the right syntactic structure but no meaning.

Example: adding two array names etc.

Lexical Analysis

Lexical analysis is the first phase in compiler design.

The main task of the lexical analyzer is to read the input characters
of the source program, group them into lexemes and
output a sequence of tokens, one for each lexeme in the source program.


Lexeme: Sequence of characters in the source program
that matches the pattern for a token. It is the smallest logical
unit of a program.

Example: 10, x, y, <, >, =

Tokens: These are the classes of similar lexemes.

Example: Operators: <, >, =

Identifiers: x, y
Constants: 10
Keywords: if, else, int

Operations performed by lexical analyzer 

1. Identification of lexemes and spelling check 

2. Stripping out comments and white space (blank, new 
line, tab etc). 

3. Correlating error messages generated by the compiler 
with the source program. 

4. If the source program uses a macro-preprocessor, the 
expansion of macros may also be performed by lexical 
analyzer. 

Example 1: Take the following example from Fortran
DO 5 I = 1.25
Number of tokens = 5
The 1st lexeme is the keyword DO
Tokens are DO, 5, I, =, 1.25.

Example 2: An example from a C program
for (int i = 1; i <= 10; i++)

Here tokens are for, (, int, i, =, 1, ;, i, <=, 10, ;,
i, ++, )

Number of tokens = 14
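Token counting of this kind can be checked with a small hand-written scanner. The sketch below is an assumption-laden illustration: the token classes, the regular expressions and the keyword set are chosen only to cover the C-like examples in this section, not to describe any real compiler.

```python
import re

# Hypothetical token classes; longer operators are listed before shorter ones.
TOKEN_SPEC = [
    ("NUMBER",   r"\d+(?:\.\d+)?"),
    ("IDENT",    r"[A-Za-z_]\w*"),
    ("OPERATOR", r"\+\+|<=|>=|==|[-+*/<>=]"),
    ("PUNCT",    r"[();{},]"),
    ("SKIP",     r"\s+"),          # white space is stripped, not returned
]
KEYWORDS = {"if", "else", "int", "for"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token class, lexeme) pairs for `source`."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            # The remaining characters do not form any token.
            raise SyntaxError(f"lexical error at position {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens

print(len(tokenize("if (a < b)")))
print(len(tokenize("for (int i = 1; i <= 10; i++)")))
```

Listing `<=` and `++` ahead of the single-character operators makes the scanner pick the longest operator, which is why `i <= 10` yields three tokens rather than four.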


LEX compiler 

The lexical analyzer divides the source code into tokens. To
implement a lexical analyzer we have two techniques, namely
hand coding and the LEX tool.

LEX is an automated tool which generates a lexical analyzer
from the rules given by regular expressions.
These rules are also called pattern recognizing rules.

Syntax Analysis

The output of the parser is the parse tree (syntax tree), each leaf
of which is a token.

Constructing a parse tree for a given string

Consider the grammar

E -> E + E/E * E

E -> id

The parse tree for the string
w = id + id * id is

        E
      / | \
     E  +  E
     |   / | \
     id E  *  E
        |     |
        id    id


The parser will only check the correctness of a sentence with
respect to syntax, not the meaning.


[Figure: the parser asks the lexical analyzer to "get next token" and receives a token; the lexical analyzer reports lexical errors, the parser reports syntax errors and outputs the parse tree]

How to construct a parse tree?

A parse tree can be constructed in two ways.

(i) Top-down parser: It builds parse trees from the
top (root) to the bottom (leaves).

(ii) Bottom-up parser: It starts from the leaves and works
up to the root.

In both cases, the input to the parser is scanned from left
to right, one symbol at a time.

Parser generator

It is a tool which creates a parser.

Example: compiler-compiler, YACC

The input of the parser generator is the grammar we use and
the output is the parser code.

The parser generator is used for the construction of the
compiler's front end.


Scope of declarations 

Declaration scope refers to the portion of the program text
in which rules are defined by the language.

Within the defined scope, an entity can legally access the
declared entities.

The scope of a declaration always contains the immediate scope.
The immediate scope is the region of the declarative portion
that immediately encloses the declaration.

The scope starts at the beginning of the declaration and
continues till the end of the declaration. In an
overloadable declaration, the immediate scope begins when
the callable entity profile is determined.

The visible part refers to the text portion of a declaration
which is visible from outside.

Syntax Error Handling 

1. Reports the presence of errors clearly and accurately.

2. Recovers from each error quickly.

3. It should not slow down the processing of correct
   programs.

Error Recovery Strategies

The strategies are panic mode, phrase level, error productions and
global correction.

Panic mode: On discovering an error, the parser discards
input symbols one at a time until one of the synchronizing
tokens is found.

Phrase level: A parser may perform local correction on the
remaining input. It may replace a prefix of the remaining input.

Error productions: The parser can generate appropriate error
messages to indicate the erroneous construct that has been
recognized in the input.

Global corrections: There are algorithms for choosing a
minimal sequence of changes to obtain a globally least cost
correction.

Context Free Grammars
and Ambiguity

A grammar is a set of rules or productions which generates
a collection of finite/infinite strings.

It is a 4-tuple defined as G = (V, T, P, S)

Where

V = set of variables
T = set of terminals
P = set of production rules
S = start symbol


Example: S -> (S)/ε

Here S is the start symbol and the only variable,
( and ) are the terminals,
and S -> (S), S -> ε are the production rules.


Sentential forms

If S =>* α, where α may contain non-terminals, then we say
that α is a sentential form of G. A sentence is a sentential form with
no non-terminals.

-(id + id) is a sentence of the grammar (G1).


Derivations

Right most derivations
E => -E => -(E)
  => -(E + E)
  => -(E + id)
  => -(id + id)

Left most derivations
E => -E => -(E)
  => -(E + E)
  => -(id + E)
  => -(id + id)

Right most derivations are also known as canonical
derivations.


[Figure 2: Rightmost derivation]

Ambiguity is problematic because the meaning of the
program can be incorrect.

Ambiguity can be handled in several ways:

1. Enforce associativity and precedence.

2. Rewrite the grammar by eliminating left recursion and
   left factoring.

Removal of ambiguity 

A grammar is said to be ambiguous if there exists more
than one derivation tree for the given input string.

The ambiguity of a grammar is undecidable in general; the ambiguity
of a particular grammar can be eliminated by rewriting the grammar.

Example:

E -> E + E/id        (ambiguous grammar)

E -> E + T/T
T -> id              (rewritten grammar, unambiguous)


Ambiguity 

A grammar that produces more than one parse tree for some
sentence is said to be ambiguous.

Or

A grammar that produces more than one left most or more
than one right most derivation for some sentence is ambiguous.

For example consider the following grammar.

String -> String + String/String - String/0/1/2/.../9

9 - 5 + 2 has two parse trees: one groups the string as
(9 - 5) + 2 and the other as 9 - (5 + 2).

[Figure 1: Leftmost derivation]
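Why the two parse trees matter can be seen by evaluating each of them. The following sketch enumerates every parse tree of a token list under the ambiguous String grammar and returns the value each tree computes; the function name parse_values and the brute-force splitting strategy are assumptions made for illustration only.

```python
from functools import lru_cache

def parse_values(tokens):
    """Value of every parse tree of `tokens` under
    String -> String + String / String - String / digit."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def values(lo, hi):
        # All trees that derive tokens[lo:hi].
        if hi - lo == 1:
            return (int(tokens[lo]),) if tokens[lo].isdigit() else ()
        found = []
        for k in range(lo + 1, hi - 1):       # try each operator as the root
            if tokens[k] in ("+", "-"):
                for left in values(lo, k):
                    for right in values(k + 1, hi):
                        found.append(left + right if tokens[k] == "+" else left - right)
        return tuple(found)

    return list(values(0, len(tokens)))

results = parse_values("9 - 5 + 2".split())
print(len(results), results)
```

One sentence, two trees, two different answers: this is the sense in which ambiguity can make the meaning of a program incorrect.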


Left recursion 

Left recursion can take the parser into an infinite loop, so we
need to remove left recursion.

Elimination of left recursion
A -> Aα/β is left recursive.

It can be replaced by a non-left-recursive grammar:

A -> βA'

A' -> αA'/ε

In general

A -> Aα1/Aα2/.../Aαm/β1/β2/.../βn

We can replace the A productions by

A -> β1A'/β2A'/.../βnA'

A' -> α1A'/α2A'/.../αmA'/ε

Example 3: Eliminate left recursion from

E -> E + T/T
T -> T * F/F
F -> (E)/id

Solution: E -> E + T/T is in the form
A -> Aα/β

So we can write it as E -> TE'
                     E' -> +TE'/ε

Similarly the other productions are written as

T -> FT'
T' -> *FT'/ε
F -> (E)/id

Example 4: Eliminate left recursion from the grammar

S -> (L)/a
L -> L, S/S

Solution: S -> (L)/a
L -> SL'
L' -> ,SL'/ε
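The replacement rule for immediate left recursion is mechanical, so it can be sketched as a small function. The representation below (each right side as a list of symbols, the empty list standing for ε, and a primed name for the new non-terminal) is an assumption for the example; indirect left recursion is not handled.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Remove immediate left recursion from one non-terminal.

    alternatives: list of right sides, each a list of symbols.
    Returns a dict of productions; [] stands for epsilon.
    """
    # A -> A alpha : keep alpha (the part after the leading A)
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # A -> beta : alternatives that do not start with A
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}     # nothing to do
    new = nonterminal + "'"
    return {
        nonterminal: [beta + [new] for beta in betas],          # A  -> beta A'
        new: [alpha + [new] for alpha in alphas] + [[]],        # A' -> alpha A' / epsilon
    }

print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]]))
print(eliminate_left_recursion("L", [["L", ",", "S"], ["S"]]))
```

Applied to E -> E + T/T this gives E -> TE' and E' -> +TE'/ε, and applied to L -> L, S/S it gives L -> SL' and L' -> ,SL'/ε.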

Left factoring 

A grammar with common prefixes is called a non-deterministic
grammar. To make it deterministic we need to remove the
common prefixes. This process is called left factoring.

The grammar: A -> αβ1/αβ2 can be transformed into

A -> αA'

A' -> β1/β2

Example 5: What is the resultant grammar after left
factoring the following grammar?

S -> iEtS/iEtSeS/a

E -> b

Solution: S -> iEtSS'/a
S' -> eS/ε
E -> b

Types of Parsing 

Parsers
    Topdown parsers (predictive parser)
        Recursive descent parsing
        Non-recursive descent parsing
    Bottom up parsers
        Operator precedence parsing
        LR parsers
            SLR
            CLR
            LALR


Topdown Parsing 

A parse tree is constructed for the input starting from the 
root and creating the nodes of the parse tree in preorder. It 
simulates the left most derivation. 

Backtracking Parsing 

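If we make a sequence of erroneous expansions and subsequently
discover a mismatch, we undo the effects and roll back the input
pointer.

This method is also known as brute force parsing.
Example: S -> cAd

A -> ab/a

Let the string w = cad. Expanding S gives c A d, and the parser first
tries A -> ab. The string generated from this parse tree is cabd,
but w = cad, so the third symbol is not matched.
So, report error and go back to A.
Now consider the other alternative production, A -> a.

The string generated is 'cad' and w = cad.
In this we have used backtracking. It is a costly and time
consuming approach, and is thus outdated.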
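The roll-back behaviour can be illustrated with Python generators: each failed alternative simply yields nothing and the caller moves on to the next one, starting again from the saved input position. This is a sketch under the assumption that the grammar is S -> cAd, A -> ab/a and has no left recursion; the names GRAMMAR, derive and accepts are hypothetical.

```python
GRAMMAR = {"S": [["c", "A", "d"]], "A": [["a", "b"], ["a"]]}

def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        for alternative in GRAMMAR[first]:            # try alternatives in order
            for mid in derive(alternative, text, pos):
                # If the rest fails from `mid`, nothing is yielded and the
                # next alternative is tried from the same `pos` (backtracking).
                yield from derive(rest, text, mid)
    elif pos < len(text) and text[pos] == first:      # terminal must match
        yield from derive(rest, text, pos + 1)

def accepts(text):
    return any(end == len(text) for end in derive(["S"], text, 0))

print(accepts("cad"), accepts("cabd"), accepts("cd"))
```

On the input cad, the alternative A -> ab is tried first and fails at the third symbol, after which A -> a succeeds. The repeated re-scanning is what makes the approach costly.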

Predictive Parsers 

By eliminating left recursion and by left factoring the grammar,
we can have a parse tree without backtracking. To construct a
predictive parser, we must know,

1. Current input symbol

2. Non-terminal which is to be expanded

A procedure is associated with each non-terminal of the
grammar.

Recursive descent parsing

In recursive descent parsing, we execute a set of recursive
procedures to process the input.

The sequence of procedures called implicitly defines a
parse tree for the input.
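A recursive descent parser has one procedure per non-terminal. The sketch below assumes the expression grammar E -> TE', E' -> +TE'/ε, T -> FT', T' -> *FT'/ε, F -> (E)/id, with the input already split into tokens; the class and method names are illustrative only.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # $ marks the right end of the input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]         # current input symbol

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):                             # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):                       # E' -> + T E' / epsilon
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()

    def T(self):                             # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):                       # T' -> * F T' / epsilon
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):                             # F -> (E) / id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")

def parse(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
    except SyntaxError:
        return False
    return True

print(parse(["id", "+", "id", "*", "id"]), parse(["id", "+"]))
```

Each procedure chooses its production by looking only at the current input symbol, so no backtracking is needed.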

Non-recursive predictive parsing 

(table driven parsing) 

• It maintains a stack explicitly, rather than implicitly via
  recursive calls.

• A table driven predictive parser has

  -> An input buffer
  -> A stack
  -> A parsing table
  -> Output stream

Constructing a parsing table

To construct a parsing table, we have to learn about two
functions:

(i) FIRST ()
(ii) FOLLOW ()

FIRST (X): To compute FIRST (X) for all grammar symbols X,
apply the following rules until no more terminals or ε can be
added to any FIRST set.

1. If X is a terminal, then FIRST (X) is {X}.

2. If X -> ε is a production, then add ε to FIRST (X).

3. If X is a non-terminal and X -> Y1 Y2 ... Yk is a production,
   then place 'a' in FIRST (X) if for some i, a is in FIRST (Yi)
   and ε is in all of FIRST (Y1), ..., FIRST (Yi-1); that is,
   Y1 ... Yi-1 =>* ε. If ε is in FIRST (Yj) for all j = 1, 2, ..., k,
   then add ε to FIRST (X). For example, everything in
   FIRST (Y1) is surely in FIRST (X). If Y1 does not
   derive ε, then add nothing more to FIRST (X), but if
   Y1 =>* ε, then add FIRST (Y2) and so on.

FOLLOW (A): To compute FOLLOW (A) for all non-terminals A,
apply the following rules until nothing can be
added to any FOLLOW set.

1. Place $ in FOLLOW (S), where S is the start symbol
   and $ is the input right end marker.

2. If there is a production A -> αBβ, then everything in
   FIRST (β) except ε is placed in FOLLOW (B).

3. If there is a production A -> αB or a production
   A -> αBβ, where FIRST (β) contains ε, then everything
   in FOLLOW (A) is in FOLLOW (B).

Example: Consider the grammar
E -> TE'
E' -> +TE'/ε
T -> FT'
T' -> *FT'/ε
F -> (E)/id. Then

FIRST (E) = FIRST (T) = FIRST (F) = {(, id}

FIRST (E') = {+, ε}

FIRST (T') = {*, ε}

FOLLOW (E) = FOLLOW (E') = {), $}

FOLLOW (T) = FOLLOW (T') = {+, ), $}

FOLLOW (F) = {*, +, ), $}
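The FIRST and FOLLOW rules are fixed-point computations: the rules are applied repeatedly until no set changes. The sketch below assumes the example grammar is stored as a dictionary from non-terminal to a list of right sides, with the empty list standing for ε; the helper names are hypothetical.

```python
EPS = "ε"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}   # terminal: {itself}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)        # every symbol can derive ε (or the string is empty)
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                       # repeat until nothing can be added
        changed = False
        for nt, alternatives in GRAMMAR.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")               # rule 1
    changed = True
    while changed:
        changed = False
        for head, alternatives in GRAMMAR.items():
            for alt in alternatives:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    tail = first_of_string(alt[i + 1:], first)
                    new = tail - {EPS}               # rule 2
                    if EPS in tail:
                        new |= follow[head]          # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
print(FIRST)
print(FOLLOW)
```

For this grammar the computed sets agree with the ones listed in the example above.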

Steps for the construction of predictive
parsing table

1. For each production A -> α of the grammar, do steps 2
   and 3.

2. For each terminal a in FIRST (α), add A -> α to M [A, a].

3. If ε is in FIRST (α), add A -> α to M [A, b] for each
   terminal b in FOLLOW (A). If ε is in FIRST (α) and $
   is in FOLLOW (A), add A -> α to M [A, $].

4. Make each undefined entry of M be error.

By applying these rules to the above grammar, we will get
the following parsing table.

Non-terminal   id          +             *             (           )           $
E              E -> TE'                                E -> TE'
E'                         E' -> +TE'                              E' -> ε     E' -> ε
T              T -> FT'                                T -> FT'
T'                         T' -> ε       T' -> *FT'                T' -> ε     T' -> ε
F              F -> id                                 F -> (E)



The parser is controlled by a program. The program
considers X, the symbol on top of the stack, and a, the current
input symbol.

1. If X = a = $, the parser halts and announces successful
   completion of parsing.

2. If X = a ≠ $, the parser pops X off the stack and advances
   the input pointer to the next input symbol.

3. If X is a non-terminal, the program consults entry M[X,
   a] of the parsing table M. This entry will be either an
   X-production of the grammar or an error entry. If M[X,
   a] = {X -> UVW}, the parser replaces X on top of the
   stack by WVU with U on the top.

If M[X, a] = error, the parser calls an error recovery routine.
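The control program can be sketched as a short loop over an explicit stack. The table below is an assumption: it is the predictive parsing table for E -> TE', E' -> +TE'/ε, T -> FT', T' -> *FT'/ε, F -> (E)/id written as a Python dictionary, with a missing key standing for an error entry.

```python
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"],           ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def predictive_parse(tokens, start="E"):
    """Return the productions output by the parser, or raise SyntaxError."""
    stack = ["$", start]                  # the top of the stack is the list's end
    buffer = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        X, a = stack[-1], buffer[pos]
        if X == a == "$":                 # step 1: successful completion
            return output
        if X not in NONTERMINALS:         # step 2: X is a terminal (or $)
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()
            pos += 1
        else:                             # step 3: consult M[X, a]
            body = TABLE.get((X, a))
            if body is None:
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(body))  # leftmost body symbol ends up on top
            output.append(f"{X} -> {' '.join(body) or 'ε'}")

for production in predictive_parse(["id", "+", "id", "*", "id"]):
    print(production)
```

The printed productions form a leftmost derivation of the input, which is exactly what a top-down parser simulates.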

For example, consider the moves made by the predictive
parser on input id + id * id, which are shown below:

Matched      Stack        Input         Action
             E$           id+id*id$
             TE'$         id+id*id$     Output E -> TE'
             FT'E'$       id+id*id$     Output T -> FT'
             idT'E'$      id+id*id$     Output F -> id
id           T'E'$        +id*id$       Match id
id           E'$          +id*id$       Output T' -> ε
id           +TE'$        +id*id$       Output E' -> +TE'
id+          TE'$         id*id$        Match +
id+          FT'E'$       id*id$        Output T -> FT'
id+          idT'E'$      id*id$        Output F -> id
id+id        T'E'$        *id$          Match id
id+id        *FT'E'$      *id$          Output T' -> *FT'
id+id*       FT'E'$       id$           Match *
id+id*       idT'E'$      id$           Output F -> id
id+id*id     T'E'$        $             Match id
id+id*id     E'$          $             Output T' -> ε
id+id*id     $            $             Output E' -> ε

BOTTOM UP PARSING

• This parsing constructs the parse tree for an input string
  beginning from the bottom (the leaves) and moves up towards the root.

• The general style of bottom-up parsing is shift-reduce parsing.

Shift-Reduce Parsing

Reduce a string to the start symbol of the grammar. It simulates
the reverse of a right most derivation.

In every step a particular substring is matched (in left to
right fashion) to the right side of some production, and then
it is substituted by the non-terminal on the left hand side of
the production.

For example consider the grammar
S -> aABe
A -> Abc/b
B -> d

In bottom-up parsing the string 'abbcde' is verified as

abbcde
aAbcde
aAde
aABe
S

Read from bottom to top, these steps are a rightmost derivation,
so the reductions trace it in reverse order.

Stack implementation of shift-reduce parser 

The shift-reduce parser consists of an input buffer, a stack and
a parse table.

The input buffer holds the input string, with each cell containing
only one input symbol.

The stack contains the grammar symbols. The grammar symbols
are inserted using the shift operation and they are reduced
using the reduce operation after obtaining a handle from the
collection of buffer symbols.

The parse table consists of 2 parts, goto and action, which are
constructed using terminals, non-terminals and compiler items.

Let us illustrate the above stack implementation.

Let the grammar be
S -> AA
A -> aA
A -> b

Let the input string w be abab$

w = abab$


Rightmost derivation

For bottom up parsing, we are using the right most derivation
in reverse.

Handle of a string: A substring that matches the RHS of
some production and whose reduction to the non-terminal
on the LHS is a step along the reverse of some rightmost
derivation.

S =>* αAw => αβw

Right sentential forms of an unambiguous grammar have
one unique handle.

Example: For grammar, S -> aABe
A -> Abc/b
B -> d

S => aABe => aAde => aAbcde => abbcde
Note: The handles of these right sentential forms are aABe, d, Abc
and b respectively.

Handle pruning: The process of discovering a handle and
reducing it to the appropriate left hand side is called handle
pruning. Handle pruning forms the basis for bottom-up
parsing.

To construct the rightmost derivation:

S = r0 => r1 => r2 => ... => rn = w

Apply the following simple algorithm:

For i <- n to 1
    Find the handle Ai -> Bi in ri
    Replace Bi with Ai to generate r(i-1)

[Figure: the cut of a parse tree of a certain right sentential form, rooted at S]

Stack       Input String     Action
$           abab$            Shift
$a          bab$             Shift
$ab         ab$              Reduce (A -> b)
$aA         ab$              Reduce (A -> aA)
$A          ab$              Shift
$Aa         b$               Shift
$Aab        $                Reduce (A -> b)
$AaA        $                Reduce (A -> aA)
$AA         $                Reduce (S -> AA)
$S          $                Accept
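The moves in the table can be reproduced with a deliberately naive shift-reduce loop. This is only a sketch: it assumes the grammar S -> AA, A -> aA, A -> b with single-character symbols, and it reduces whenever the top of the stack matches some right side. That greedy choice happens to find the correct handle for this grammar; in general a parse table is needed to locate handles.

```python
PRODUCTIONS = [("A", "b"), ("A", "aA"), ("S", "AA")]

def shift_reduce(text, start="S"):
    """Return the list of actions taken while parsing `text`."""
    stack, rest, trace = "", text, []
    while True:
        for head, body in PRODUCTIONS:
            if stack.endswith(body):                  # handle on top of the stack
                trace.append(f"Reduce ({head} -> {body})")
                stack = stack[: len(stack) - len(body)] + head
                break
        else:                                         # no reduction was possible
            if rest:
                trace.append("Shift")
                stack, rest = stack + rest[0], rest[1:]
            elif stack == start:                      # input consumed, only S left
                trace.append("Accept")
                return trace
            else:
                raise SyntaxError(f"stuck with stack {stack!r}")

for action in shift_reduce("abab"):
    print(action)
```

The sequence of reductions, read backwards, is the rightmost derivation S => AA => AaA => Aab => aAab => abab.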


Here A -> β is a handle for αβw.

Shift reduce parsing with a stack: There are two problems
with this technique:

(i) To locate the handle

(ii) Decide which production to use

General construction using a stack

1. 'Shift' input symbols onto the stack until a handle is
   found on top of it.

2. 'Reduce' the handle to the corresponding non-terminal.

3. 'Accept' when the input is consumed and only the start
   symbol is on the stack.

4. Errors - call an error reporting/recovery routine.




















Viable prefixes

The set of prefixes of a right sentential form that can
appear on the stack of a shift-reduce parser are called
viable prefixes.
Conflicts

Conflicts
    Shift/reduce conflict
    Reduce/reduce conflict

Shift/reduce conflict

Example: stmt -> if expr then stmt
               | if expr then stmt else stmt
               | any other statement

If 'if expr then stmt' is on the stack, in this case we
can't tell whether it is a handle, i.e., 'shift/reduce' conflict.

Reduce/reduce conflict

Example: S -> aA/bB
A -> c
B -> c

For w = ac it gives a reduce/reduce conflict.


Operator Precedence Grammar

In an operator grammar, no production rule can have:

• ε at the right side.

• two adjacent non-terminals at the right side.

Example 1: E -> E + E/E - E/id is an operator grammar.

Example 2: E -> AB
A -> a
B -> b
is not an operator grammar.

Example 3: E -> EOE/id
is not an operator grammar.

Precedence relation: If
a < b then b has higher precedence than a
a = b then b has same precedence as a
a > b then b has lower precedence than a

Common ways for determining the precedence relation
between a pair of terminals:

1. Traditional notions of associativity and precedence.

   Example: * has higher precedence than + (or) + < *

2. First construct an unambiguous grammar for the language
   which reflects correct associativity and precedence
   in its parse tree.

Operator precedence relations from
associativity and precedence

Let us use $ to mark the end of each string. Define $ < b and b > $
for all terminals b. Consider the grammar

E -> E + E/E * E/id

The precedence relations for this grammar are:

        id     +      *      $
id             >      >      >
+       <      >      <      >
*       <      >      >      >
$       <      <      <      accept

1. Scan the string from left until > is encountered.

2. Then scan backwards (to the left) over any = until < is
   encountered.

3. The handle contains everything to the left of the first >
   and to the right of the < that is encountered.

After inserting precedence relations,
$ id + id * id $ is
$ < id > + < id > * < id > $

Precedence functions: Instead of storing the entire table of
precedence relations, we can encode it by precedence
functions f and g, which map terminal symbols to integers.

1. f(a) < g(b) whenever a < b
2. f(a) = g(b) whenever a = b
3. f(a) > g(b) whenever a > b

Finding precedence functions for a table

1. Create symbols f(a) and g(a) for each 'a' that is a
   terminal or $.

2. Partition the created symbols into as many groups as
   possible, in such a way that if a = b then f(a) and g(b) are
   in the same group.

3. Create a directed graph:

   If a < b then place an edge from g(b) to f(a)

   If a > b then place an edge from f(a) to g(b)

4. If the graph constructed has a cycle then no precedence
   functions exist.

   If there are no cycles, let f(a) be the length of the
   longest path beginning at the group of f(a).

   Let g(a) be the length of the longest path from the
   group of g(a).
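The graph construction can be sketched directly. The relation table below is an assumption: it encodes the usual relations for E -> E + E/E * E/id, with * above + and both operators left associative. Because this table has no = relations, every group contains a single symbol, which keeps the sketch short; handling = would require merging nodes first.

```python
# (a, b) -> relation between terminal a on the left and terminal b on the right
RELATIONS = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
TERMINALS = ["id", "+", "*", "$"]

def precedence_functions():
    """Return (f, g) as dictionaries, or raise ValueError on a cycle."""
    edges = {("f", t): set() for t in TERMINALS}
    edges.update({("g", t): set() for t in TERMINALS})
    for (a, b), rel in RELATIONS.items():
        if rel == "<":
            edges[("g", b)].add(("f", a))     # a < b: edge from g(b) to f(a)
        elif rel == ">":
            edges[("f", a)].add(("g", b))     # a > b: edge from f(a) to g(b)

    longest, on_path = {}, set()

    def depth(node):
        """Length of the longest path that begins at `node`."""
        if node in longest:
            return longest[node]
        if node in on_path:
            raise ValueError("cycle found: no precedence functions exist")
        on_path.add(node)
        best = max((1 + depth(nxt) for nxt in edges[node]), default=0)
        on_path.discard(node)
        longest[node] = best
        return best

    f = {t: depth(("f", t)) for t in TERMINALS}
    g = {t: depth(("g", t)) for t in TERMINALS}
    return f, g

f, g = precedence_functions()
print("f:", f)
print("g:", g)
```

Comparing f(a) with g(b) then reproduces every entry of the relation table; for instance f(+) = 2 < g(*) = 3 matches + < *.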


Disadvantages of operator
precedence parsing

1. It cannot handle unary minus.

2. It is difficult to decide which language is recognized by the
   grammar.

Advantages

1. Simple.

2. Powerful enough for expressions in programming
   languages.

Error cases

1. No relation holds between the terminal on the top of the
   stack and the next input symbol.

2. A handle is found, but there is no production with this
   handle as the right side.

Error recovery

To recover, we must modify (insert/change)

1. Stack or

2. Input or

3. Both

We must be careful that we don't get into an infinite loop.


LR Parsers

• In LR (K), L stands for Left to right scanning, R stands
  for Right most derivation, and K stands for the number of look
  ahead symbols.

• LR parsers are table-driven, much like the non-recursive
  LL parsers. A grammar which is used in the construction of an
  LR parser is an LR grammar. For a grammar to be LR it is
  sufficient that a left-to-right shift-reduce parser be able
  to recognize handles of right-sentential forms when they
  appear on the top of the stack.

• The time complexity for such parsers is O (n).


• LR parsers are faster than LL (1) parsers.

• LR parsing is attractive because

  ■ It is the most general non-backtracking shift-reduce parser.

  ■ The class of grammars that can be parsed using LR
    methods is a proper superset of the class that can be parsed
    by predictive parsers. LL (1) grammars ⊂ LR (1) grammars.

  ■ An LR parser can detect a syntactic error in the left to right
    scan of the input.

• LR parsers can be implemented in 3 ways: 


1. Simple LR (SLR): The easiest to implement but the 
least powerful of the three. 

2. Canonical LR (CLR): most powerful and most 
expensive. 

3. Look ahead LR (LALR): Intermediate between the
   other two. It works on most programming
   language grammars.


Disadvantages of LR parser

1. Detecting a handle is an overhead, so a parser generator is
   used.

2. The main problem is finding the handle on the stack
   and replacing it with the non-terminal on the left
   hand side of the production.


The LR parsing algorithm 

• It consists of an input, an output, a stack, a driver program
  and a parsing table that has two parts (action and goto).
  The driver/parser program is the same for all these LR
  parsers; only the parsing table changes from one parser to another.



Stack: To store the string of the form,
S0 X1 S1 X2 S2 ... Xm Sm where

Sm : state

Xm : grammar symbol

Each state symbol summarizes the information contained in
the stack below it.

Parsing table: The parsing table consists of two parts:

1. Action part

2. Goto part

ACTION Part:

Let Sm -> top of the stack
    ai -> current symbol

Then action [Sm, ai] can have one of four values:

1. Shift S, where S is a state

2. Reduce by a grammar production A -> β

3. Accept

4. Error

GOTO Part:

If goto (S, A) = X where S -> state, A -> non-terminal, then
GOTO maps state S and non-terminal A to state X.

Configuration

(S0 X1 S1 X2 S2 ... Xm Sm, ai ai+1 ... an $)

The next move of the parser is based on action [Sm, ai].
The configurations are as follows.

1. If action [Sm, ai] = shift S, then

   (S0 X1 S1 X2 S2 ... Xm Sm ai S, ai+1 ... an $)

2. If action [Sm, ai] = reduce A -> β, then

   (S0 X1 S1 ... Xm-r Sm-r A S, ai ai+1 ... an $)

   where S = goto [Sm-r, A] and r is the length of β.

3. If action [Sm, ai] = accept, parsing is complete.

4. If action [Sm, ai] = error, it calls an error recovery
   routine.

Example: The parsing table for the following grammar is
shown below:

1. E -> E + T     2. E -> T
3. T -> T * F     4. T -> F
5. F -> (E)       6. F -> id

             Action                              Goto
State   id    +     *     (     )     $       E    T    F
0       s5                s4                  1    2    3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                  8    2    3
5             r6    r6          r6    r6
6       s5                s4                       9    3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5

Moves of the LR parser on input string id * id + id are shown below:

Stack               Input            Action
0                   id * id + id$    Shift 5
0 id 5              * id + id$       reduce 6 means reduce with the 6th production F -> id and goto [0, F] = 3
0 F 3               * id + id$       reduce 4 i.e. T -> F, goto [0, T] = 2
0 T 2               * id + id$       Shift 7
0 T 2 * 7           id + id$         Shift 5
0 T 2 * 7 id 5      + id$            reduce 6 i.e. F -> id, goto [7, F] = 10
0 T 2 * 7 F 10      + id$            reduce 3 i.e. T -> T * F, goto [0, T] = 2
0 T 2               + id$            reduce 2 i.e. E -> T & goto [0, E] = 1
0 E 1               + id$            Shift 6
0 E 1 + 6           id$              Shift 5
0 E 1 + 6 id 5      $                reduce 6 & goto [6, F] = 3
0 E 1 + 6 F 3       $                reduce 4 & goto [6, T] = 9
0 E 1 + 6 T 9       $                reduce 1 & goto [0, E] = 1
0 E 1               $                accept

Constructing SLR parsing table

LR (0) item: An LR (0) item of a grammar G is a production of G
with a dot at some position of the right side of the production.

Example: A -> BCD

Possible LR (0) items are

A -> .BCD
A -> B.CD
A -> BC.D
A -> BCD.

A -> B.CD means we have seen an input string derivable
from B and hope to see a string derivable from CD.

The LR (0) items are constructed as a DFA from the grammar
to recognize viable prefixes.

The items can be viewed as the states of an NFA.

The collection of sets of LR (0) items, called the canonical LR (0)
collection, provides the basis for constructing an SLR parser.

To construct LR (0) items, we define

(i) the augmented grammar
(ii) closure and goto

Augmented grammar (G'): If G is a grammar with start
symbol S, then G' is the augmented grammar for G, with new start
symbol S' and production S' -> S.

The purpose of G' is to indicate when to stop parsing and
announce acceptance of the input.

Closure operation: Closure (I) includes

1. Initially, every item in I is added to closure (I).

2. If A -> α.Bβ is in closure (I) and B -> γ is a production,
   then add B -> .γ to closure (I).

Goto operation

Goto (I, X) is defined to be the closure of the set of all items
[A -> αX.β] such that [A -> α.Xβ] is in I.

Items

Kernel items: S' -> .S and all items whose dots are not at the
left end.

Non-kernel items: Items which have their dots at the left end.

Construction of sets of items

Procedure items (G')
Begin
    C := closure ({[S' -> .S]});
    Repeat
        For each set of items I in C and each grammar symbol X
        such that goto (I, X) is not empty and not in C do
            add goto (I, X) to C;
    Until no more sets of items can be added to C
End.

Example: LR (0) items for the grammar

E' -> E
E -> E + T/T
T -> T * F/F
F -> (E)/id

are given below:

I0: E' -> .E
E -> .E + T
E -> .T





T -> .T * F
T -> .F
F -> .(E)
F -> .id

I1: goto (I0, E)
E' -> E.
E -> E. + T

I2: goto (I0, T)
E -> T.
T -> T. * F

I3: goto (I0, F)
T -> F.

I4: goto (I0, ( )
F -> (.E)
E -> .E + T
E -> .T
T -> .T * F
T -> .F
F -> .(E)
F -> .id

I5: goto (I0, id)
F -> id.

I6: goto (I1, +)
E -> E + .T
T -> .T * F
T -> .F
F -> .(E)
F -> .id

I7: goto (I2, *)
T -> T * .F
F -> .(E)
F -> .id

I8: goto (I4, E)
F -> (E.)
E -> E. + T

I9: goto (I6, T)
E -> E + T.
T -> T. * F

I10: goto (I7, F)
T -> T * F.

I11: goto (I8, ))
F -> (E).
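The closure and goto operations and the items procedure can be sketched compactly if an item is stored as a pair (production number, dot position). Everything about this representation is an assumption made for illustration, and the order in which the sets are discovered here need not match the numbering I0 to I11 used in the text.

```python
GRAMMAR = [
    ("E'", ("E",)),                                  # augmented production
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """items: iterable of (production index, dot position) pairs."""
    result = set(items)
    work = list(result)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            # Dot stands before non-terminal B: add B -> .gamma for every B-production.
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` wherever possible, then take the closure."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()

def canonical_collection():
    start = closure({(0, 0)})                        # closure({[E' -> .E]})
    collection, work = [start], [start]
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    while work:
        state = work.pop(0)
        for sym in symbols:
            target = goto(state, sym)
            if target and target not in collection:
                collection.append(target)
                work.append(target)
    return collection

states = canonical_collection()
print(len(states), "sets of LR(0) items")
```

For the augmented expression grammar the procedure stops after finding 12 sets of items, matching I0 through I11 above.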



For viable prefixes, construct the DFA as follows:

[Figure: DFA over the sets of items I0 to I11]

SLR parsing table construction 

1. Construct the canonical collection of sets of LR (0)
   items for G'.

2. Create the parsing action table as follows:

   (a) If a is a terminal and [A -> α.aβ] is in Ii and goto
       (Ii, a) = Ij, then set action [i, a] to shift j. Here a must
       be a terminal.

   (b) If [A -> α.] is in Ii, then set action [i, a] to 'reduce
       A -> α' for all a in FOLLOW (A).

   (c) If [S' -> S.] is in Ii, then set action [i, $] to 'accept'.

3. Create the parsing goto table for all non-terminals A: if
   goto (Ii, A) = Ij then goto [i, A] = j.

4. All entries not defined by steps 2 and 3 are made errors.

5. The initial state of the parser is the one containing S' -> .S.

The parsing table constructed using the above algorithm
is known as the SLR (1) table for G.

Note: Every SLR (1) grammar is unambiguous, but not every
unambiguous grammar is an SLR (1) grammar.

Example 6: Construct SLR parsing table for the following grammar:

1. S → L = R
2. S → R
3. L → *R
4. L → id
5. R → L

Solution: For the construction of the SLR parsing table, add the S' → S production.

S' → S
S → L = R
S → R
L → *R
L → id
R → L

The LR (0) items will be

I0: S' → .S
S → .L = R
S → .R
L → .*R
L → .id
R → .L

I1: goto (I0, S)
S' → S.

I2: goto (I0, L)
S → L. = R
R → L.

I3: goto (I0, R)
S → R.

I4: goto (I0, *)
L → *.R
R → .L
L → .*R
L → .id

I5: goto (I0, id)
L → id.

I6: goto (I2, =)
S → L = .R
R → .L
L → .*R
L → .id

I7: goto (I4, R)
L → *R.

I8: goto (I4, L)
R → L.

I9: goto (I6, R)
S → L = R.


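The DFA of LR(0) items will be

[Figure: DFA over the item sets I0 to I9]

The SLR parsing table (reductions use the production numbers 1 to 5 given in the grammar):

         Action                          Goto
State    id     *      =        $        S    L    R
0        s5     s4                       1    2    3
1                               acc
2                      s6/r5    r5
3                               r2
4        s5     s4                            8    7
5                      r4       r4
6        s5     s4                            8    9
7                      r3       r3
8                      r5       r5
9                               r1

FOLLOW (S) = {$}

FOLLOW (L) = {=, $}

FOLLOW (R) = {$, =}

For action [2, =] we get both s6 and r5.

Here we are getting a shift-reduce conflict, so the grammar is not SLR (1).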
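The conflict can be checked mechanically by computing FOLLOW (R) and testing whether it contains '='. The following Python sketch assumes a dictionary encoding of the grammar and hypothetical helper names (`first_sets`, `follow_sets`); it relies on the grammar having no ε-productions.

```python
# Grammar of Example 6, augmented with S' -> S. No epsilon-productions occur,
# which keeps the FIRST computation simple (an assumption of this sketch).
GRAMMAR = {
    "S'": [["S"]],
    "S": [["L", "=", "R"], ["R"]],
    "L": [["*", "R"], ["id"]],
    "R": [["L"]],
}


def first_sets(grammar):
    """FIRST of each non-terminal; only the first body symbol matters here."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                sym = body[0]
                new = first[sym] if sym in grammar else {sym}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    """FOLLOW of each non-terminal, iterated to a fixed point."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # terminals have no FOLLOW set
                    if i + 1 < len(body):
                        nxt = body[i + 1]
                        new = first[nxt] if nxt in grammar else {nxt}
                    else:
                        new = follow[head]  # sym ends the body
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = follow_sets(GRAMMAR, "S'")
# State I2 holds S -> L.=R (shift on '=') and R -> L. (reduce on FOLLOW(R)).
conflict = "=" in FOLLOW["R"]
print(sorted(FOLLOW["R"]), conflict)
```

Because '=' is in FOLLOW (R), the SLR rule places a reduce by R → L in the same table cell as the shift on '=', which is the conflict described above.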

Canonical LR Parsing (CLR)

• To avoid some invalid reductions, the states need to carry more information.

• Extra information is put into a state by including a terminal symbol as a second component of an item.

• The general form of an item is

[A → α.β, a]

where A → αβ is a production and a is a terminal or the right end marker ($). We will call it an LR (1) item.

LR (1) item

It is a combination of an LR (0) item along with the lookahead of the item. Here 1 refers to the length of the lookahead.

Construction of the sets of LR (1) items

Function closure (I):

Begin
Repeat
For each item [A → α.Bβ, a] in I,
each production B → γ in G',
and each terminal b in FIRST (βa)
such that [B → .γ, b] is not in I do
add [B → .γ, b] to I;
Until no more items can be added to I;
End;
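As a rough illustration, the closure function for LR (1) items can be written as below for the grammar S' → S, S → CC, C → cC | d. The triple encoding (production index, dot position, lookahead) and the hand-written FIRST table are assumptions of this sketch.

```python
# Item encoding (an assumption): (production index, dot position, lookahead).
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {head for head, _ in GRAMMAR}
# FIRST sets written by hand for this small grammar (no epsilon-productions).
FIRST = {"S'": {"c", "d"}, "S": {"c", "d"}, "C": {"c", "d"}}


def first_of(symbols):
    """FIRST of a non-empty symbol string; without epsilon only its head matters."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def closure(items):
    """For [A -> alpha.B beta, a] add [B -> .gamma, b] for each b in FIRST(beta a)."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for b in first_of(body[dot + 1:] + (look,)):
                for i, (head, _) in enumerate(GRAMMAR):
                    if head == body[dot] and (i, 0, b) not in result:
                        result.add((i, 0, b))
                        work.append((i, 0, b))
    return frozenset(result)


I0 = closure({(0, 0, "$")})
for item in sorted(I0):
    print(item)
```

The result contains S' → .S, $ and S → .CC, $ together with C → .cC and C → .d, each with lookaheads c and d.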

Example 7: Construct CLR parsing table for the following grammar:

S' → S
S → CC
C → cC | d

Solution: Our first set I0 starts from the item [S' → .S, $]; the remaining sets are obtained with the goto function.

I0: S' → .S, $
S → .CC, $
C → .cC, c/d
C → .d, c/d

I1: goto (I0, S)
S' → S., $

I2: goto (I0, C)
S → C.C, $
C → .cC, $
C → .d, $

I3: goto (I0, c)
C → c.C, c/d
C → .cC, c/d
C → .d, c/d

I4: goto (I0, d)
C → d., c/d

I5: goto (I2, C)
S → CC., $

I6: goto (I2, c)
C → c.C, $
C → .cC, $
C → .d, $

I7: goto (I2, d)
C → d., $

I8: goto (I3, C)
C → cC., c/d

I9: goto (I6, C)
C → cC., $
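CLR table is (productions numbered 1: S → CC, 2: C → cC, 3: C → d):

         Action                 Goto
State    c      d      $        S    C
0        s3     s4              1    2
1                      acc
2        s6     s7                   5
3        s3     s4                   8
4        r3     r3
5                      r1
6        s6     s7                   9
7                      r3
8        r2     r2
9                      r2

[Table: stack contents during a parse with this table, starting at 0 and ending at 0 S 1]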


Example 8: Construct CLR parsing table for the following grammar:

S → L = R
S → R
L → *R
L → id
R → L

Solution: The canonical set of items is

I0: S' → .S, $
S → .L = R, $
S → .R, $
L → .*R, =/$
L → .id, =/$
R → .L, $

[FIRST (= R$) = {=}; the lookahead $ for the L items comes from R → .L, $]

I1: goto (I0, S)
S' → S., $

I2: goto (I0, L)
S → L. = R, $
R → L., $

I3: goto (I0, R)
S → R., $

I4: goto (I0, *)
L → *.R, =/$
R → .L, =/$
L → .*R, =/$
L → .id, =/$

I5: goto (I0, id)
L → id., =/$

I6: goto (I7, L)
R → L., $

I7: goto (I2, =)
S → L = .R, $
R → .L, $
L → .*R, $
L → .id, $

I8: goto (I4, R)
L → *R., =/$

I9: goto (I4, L)
R → L., =/$

I10: goto (I7, R)
S → L = R., $

I11: goto (I7, *)
L → *.R, $
R → .L, $
L → .*R, $
L → .id, $

I12: goto (I7, id)
L → id., $

I13: goto (I11, R)
L → *R., $

Parsing of the input id = id with this collection:

Stack                  Input
0                      id = id $
0 id 5                 = id $
0 L 2                  = id $
0 L 2 = 7              id $
0 L 2 = 7 id 12        $
0 L 2 = 7 L 6          $
0 L 2 = 7 R 10         $
0 S 1 (accept)         $


The grammar is an LR (1) grammar.

CLR (1) will have more states than the SLR parser.

We have to construct the CLR parsing table based on the above diagram.

In this, we are going to have 14 states (I0 to I13).

The shift-reduce conflict of the SLR parser is resolved here.


LALR Parsing Table

• The tables obtained by it are considerably smaller than the canonical LR table.

• LALR stands for Lookahead LR.

• The number of states in SLR and LALR parsing tables for a grammar G are equal.

• But LALR parsers recognize more grammars than SLR.

• YACC creates an LALR parser for the given grammar.

• YACC stands for 'Yet Another Compiler-Compiler'.

An easy, but space-consuming LALR table construction is explained below:

1. Construct C = {I0, I1, ..., In}, the collection of sets of LR (1) items.

2. Find all sets having a common core; replace these sets by their union.

3. Let C' = {J0, J1, ..., Jm} be the resulting sets of LR (1) items. If there is a parsing action conflict, then the grammar is not LALR (1).

4. Let K be the union of all sets of items having the same core as goto (I, X) for the sets I merged into J. Then goto (J, X) = K.


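• If there are no parsing action conflicts, then the grammar is said to be an LALR (1) grammar.

• The collection of items constructed is called the LALR (1) collection.

CLR parsing table for the grammar S → L = R | R, L → *R | id, R → L of Example 8 (productions numbered 1: S → L = R, 2: S → R, 3: L → *R, 4: L → id, 5: R → L):

         Action                        Goto
State    id      *       =      $      S    L    R
0        s5      s4                    1    2    3
1                               acc
2                        s7     r5
3                               r2
4        s5      s4                         9    8
5                        r4     r4
6                               r5
7        s12     s11                        6    10
8                        r3     r3
9                        r5     r5
10                              r1
11       s12     s11                        6    13
12                              r4
13                              r3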

Example 9: Construct LALR parsing table for the following grammar:

S' → S
S → CC
C → cC | d

Solution: We already got the LR (1) items and the CLR parsing table for this grammar.

After merging, I3 and I6 are replaced by I36:

I36: C → c.C, c/d/$
C → .cC, c/d/$
C → .d, c/d/$

I47: By merging I4 and I7
C → d., c/d/$

I89: I8 and I9 are replaced by I89
C → cC., c/d/$

The LALR parsing table for this grammar is given below (productions numbered 1: S → CC, 2: C → cC, 3: C → d):

         Action                  Goto
State    c       d       $       S    C
0        s36     s47             1    2
1                        acc
2        s36     s47                  5
36       s36     s47                  89
47       r3      r3      r3
5                        r1
89       r2      r2      r2
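The merging step can be sketched as follows: build the LR (1) collection for S' → S, S → CC, C → cC | d, then group the sets by core (the items with lookaheads removed). The encoding and the helper names `core` and `merge_by_core` are assumptions made for illustration.

```python
# Item encoding (an assumption): (production index, dot position, lookahead).
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMINALS = {head for head, _ in GRAMMAR}
FIRST = {"S'": {"c", "d"}, "S": {"c", "d"}, "C": {"c", "d"}}  # computed by hand


def first_of(symbols):
    head = symbols[0]  # no epsilon-productions, so the head decides
    return FIRST[head] if head in NONTERMINALS else {head}


def closure(items):
    result, work = set(items), list(items)
    while work:
        prod, dot, look = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for b in first_of(body[dot + 1:] + (look,)):
                for i, (head, _) in enumerate(GRAMMAR):
                    if head == body[dot] and (i, 0, b) not in result:
                        result.add((i, 0, b))
                        work.append((i, 0, b))
    return frozenset(result)


def goto(items, symbol):
    moved = {(p, d + 1, a) for p, d, a in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)


def lr1_collection():
    symbols = sorted({s for _, body in GRAMMAR for s in body})
    states = [closure({(0, 0, "$")})]
    for state in states:  # the list grows while it is being scanned
        for sym in symbols:
            target = goto(state, sym)
            if target and target not in states:
                states.append(target)
    return states


def core(state):
    """The LR(0) part of a state: its items without lookaheads."""
    return frozenset((p, d) for p, d, _ in state)


def merge_by_core(states):
    """Union all LR(1) sets that share a core (step 2 of the construction)."""
    merged = {}
    for state in states:
        merged.setdefault(core(state), set()).update(state)
    return [frozenset(items) for items in merged.values()]


LR1 = lr1_collection()
LALR = merge_by_core(LR1)
print(len(LR1), len(LALR))
```

For this grammar the sketch goes from ten LR (1) sets to seven merged sets, which corresponds to the three merges I36, I47 and I89.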



Example: Consider the grammar

S' → S
S → aAd
S → bBd
S → aBe
S → bAe
A → c
B → c

which generates the strings acd, bcd, ace and bce.

LR (1) items are

I0: S' → .S, $
S → .aAd, $
S → .bBd, $
S → .aBe, $
S → .bAe, $

I1: goto (I0, S)
S' → S., $

I2: goto (I0, a)
S → a.Ad, $
S → a.Be, $
A → .c, d
B → .c, e

I3: goto (I0, b)
S → b.Bd, $
S → b.Ae, $
A → .c, e
B → .c, d

I4: goto (I2, A)
S → aA.d, $

I5: goto (I2, B)
S → aB.e, $

I6: goto (I2, c)
A → c., d
B → c., e

I7: goto (I3, c)
A → c., e
B → c., d

I8: goto (I4, d)
S → aAd., $

I9: goto (I5, e)
S → aBe., $

If we union I6 and I7:

A → c., d/e
B → c., d/e

It generates reduce/reduce conflict.
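A small check makes the conflict concrete. In this sketch each completed item is written as a (head, body, lookahead) triple, and the function name `reduce_reduce_conflicts` is hypothetical.

```python
# Completed items of the two states, written as (head, body, lookahead).
I6 = {("A", "c", "d"), ("B", "c", "e")}
I7 = {("A", "c", "e"), ("B", "c", "d")}


def reduce_reduce_conflicts(state):
    """Return the lookaheads on which more than one reduction is possible."""
    by_lookahead = {}
    for head, body, look in state:
        by_lookahead.setdefault(look, set()).add((head, body))
    return {look: prods for look, prods in by_lookahead.items() if len(prods) > 1}


print(reduce_reduce_conflicts(I6))       # no conflict before merging
print(reduce_reduce_conflicts(I6 | I7))  # conflicts on 'd' and 'e' after merging
```

Each state is conflict-free on its own; only the union asks for both A → c and B → c on the same lookahead.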


Notes:

1. The merging of states with common cores can never produce a shift/reduce conflict, because the shift action depends only on the core, not on the lookahead.
2. SLR and LALR tables for a grammar always have the same number of states (several hundreds), whereas CLR tables have thousands of states for the same grammar.


Comparison of parsing methods

Method      Item          Goto and Closures       Grammar it Applies to
SLR (1)     LR(0) item    Different from LR(1)    SLR (1) ⊂ LR(1)
LR (1)      LR(1) item                            LR(1): largest class of LR grammars
LALR(1)     LR(1) item    Same as LR(1)           LALR(1) ⊂ LR(1)

Every LR (0) grammar is SLR (1), but the converse is not true.




Difference between SLR, LALR and CLR parsers

Differences among SLR, LALR and CLR are discussed below in terms of size, efficiency, time and space.

Comparison of parsing methods

1. Size
   SLR parser: smaller. LALR parser: smaller. CLR parser: larger.
2. Method
   SLR parser: it is based on the FOLLOW function. LALR parser: this method is applicable to a wider class than SLR. CLR parser: this is the most powerful of the three.
3. Syntactic features
   SLR parser: less exposure compared to other LR parsers. LALR parser: most of them are expressed.
4. Error detection
   SLR parser: not immediate. LALR parser: not immediate. CLR parser: immediate.
5. Time and space complexity
   SLR parser: less time and space. LALR parser: more time and space. CLR parser: more time and space.


Exercises

Practice Problems 1

Directions for questions 1 to 15: Select the correct alternative from the given choices.

1. Consider the grammar
S → a
S → ab

The given grammar is:

(A) LR(1) only
(B) LL(1) only
(C) Both LR(1) and LL(1)
(D) LR(1) but not LL(1)

2. Which of the following is an unambiguous grammar that is not LR(1)?

(A) S → Uab | Vac
U → d
V → d

(B) S → Uab | Vab | Vac
U → d
V → d

(C) S → AB
A → a
B → b

(D) S → Ab


Common data for questions 3 and 4: Consider the grammar

S → T;S | ε
T → UR
U → x | y | [S]
R → .T | ε

3. Which of the following are correct FIRST and FOLLOW sets for the above grammar?

(i) FIRST (S) = FIRST (T) = FIRST (U) = {x, y, [, ε}
(ii) FIRST (R) = {., ε}
(iii) FOLLOW (S) = {], $}
(iv) FOLLOW (T) = FOLLOW (R) = {;}

4. If an LL (1) parsing table is constructed for the above grammar, the parsing table entry for [S → [ ] is

(A) S → T;S
(B)
(C) T → UR
(D) U → [S]

Common data for questions 5 to 7: Consider the augmented grammar

S → X
X → (X) | a

5. If a DFA is constructed for the LR (1) items of the above grammar, then the number of states present are:

(A) 8 (B) 9
(C) 7 (D) 10

6. Given grammar is

(A) Only LR(1)
(B) Only LL(1)
(C) Both LR(1) and LL(1)
(D) Neither LR(1) nor LL(1)

7. What is the number of shift-reduce steps for input (a)?

(A) 15 (B) 14
(C) 13 (D) 16

8. Consider the following two sets of LR (1) items of a grammar.

X → c.X, c/d          X → c.X, $
X → .cX, c/d          X → .cX, $
X → .d, c/d           X → .d, $

Which of the following statements related to merging of the two sets in the corresponding LALR parser is/are FALSE?

1. Cannot be merged since lookaheads are different.
2. Can be merged but will result in S-R conflict.
3. Can be merged but will result in R-R conflict.
4. Cannot be merged since goto on c will lead to two different sets.

(A) 1 only (B) 2 only
(C) 1 and 4 only (D) 1, 2, 3 and 4

o Which ot the following . 
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1 ^syntax directed translation 
• Syntax directed definition
• Dependency graph
• Constructing syntax trees for expressions
• Types of SDD's
• S-attributed definition
• L-attributed definitions
• Synthesized attributes on the parser
• Syntax directed translation schemes
• Bottom up evaluation of inherited attributes


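Syntax Directed Translation

To translate a programming language construct, a compiler may need to know the type of the construct, the location of the first instruction, the number of instructions generated, etc. So, we have to use the term 'attributes' associated with constructs.

An attribute may represent a type, a number of arguments, a memory location or similar information, which cannot be represented by the CFG alone. So, we need to have one more phase to do this, i.e., the 'semantic analysis' phase.

Syntax tree → Semantic analysis → Semantically checked syntax tree

In this phase, for each production of the CFG, we will give some semantic rule.

1. Grammar symbols are associated with attributes.

2. Values of the attributes are evaluated by the semantic rules associated with production rules.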

Notations for Associating Semantic Rules

There are two techniques to associate semantic rules:

Syntax directed definition (SDD): It is a high-level specification for translation. SDDs hide the implementation details, i.e., the order in which translation takes place.

Attributes + CFG + Semantic rules = Syntax directed definition (SDD).

Translation schemes: These schemes indicate the order in which semantic rules are to be evaluated. This is an input and output mapping.

Syntax directed translation scheme

A CFG in which a program fragment called an output action (semantic action or semantic rule) is associated with each production is known as a Syntax Directed Translation Scheme.

These semantic rules are used to

1. Generate intermediate code.

2. Put information into the symbol table.

3. Perform type checking.

4. Issue error messages.


Syntax Directed Definitions

An SDD is a generalization of a CFG in which each grammar symbol is associated with a set of attributes.

There are two types of attributes for a grammar symbol:

1. Synthesized attributes

2. Inherited attributes

Each production rule is associated with a set of semantic rules.

Semantic rules set up dependencies between attributes, which can be represented by a dependency graph.

The dependency graph determines the evaluation order of these semantic rules.

Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects, such as printing a value.

Attribute grammar: An attribute grammar is a syntax directed definition in which the functions in semantic rules 'cannot have side effects'.

Annotated parse tree: A parse tree showing the values of attributes at each node is called an annotated parse tree.

The process of computing the attribute values at the nodes is called annotating (or decorating) the parse tree.

In an SDD, each production A → α is associated with a set of semantic rules of the form:

b := f (c1, c2, ..., cn) where

f is a function and

b can be one of the following:
b is a 'synthesized attribute' of A and c1, c2, ..., cn are attributes of the grammar symbols in A → α.
The value of a 'synthesized attribute' at a node is computed from the values of attributes at the children of that node in the parse tree.

b is an 'inherited attribute' of one of the grammar symbols on the right side of the production.

An 'inherited attribute' is one whose value at a node is defined in terms of attributes at the parent and/or siblings of that node. It is used for finding the context in which it appears.

Example (synthesized attribute):

Production               Semantic Rule
expr → expr1 + term      expr.t := expr1.t || term.t || '+'
expr → expr1 - term      expr.t := expr1.t || term.t || '-'
expr → term              expr.t := term.t
term → 0                 term.t := '0'
term → 1                 term.t := '1'
...
term → 9                 term.t := '9'

Example (inherited attribute): An inherited attribute distributes type information to the various identifiers in a declaration. For the grammar

D → TL
T → int
T → real
L → L1, id
L → id

that is, the keyword int or real followed by a list of identifiers.

In this, T has the synthesized attribute type: T.type. L has an inherited attribute in: L.in.

Rules associated with L call the procedure addtype to add the type of each identifier to its entry in the symbol table.

Production       Semantic Rule
D → TL           L.in = T.type
T → int          T.type = integer
T → real         T.type = real
L → L1, id       L1.in = L.in; addtype (id.entry, L.in)
L → id           addtype (id.entry, L.in)

[Figure: annotated parse tree for a declaration]
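The way an inherited attribute carries the type down the identifier list can be sketched in Python. Here the symbol table is modelled as a plain dictionary and `declare` is a hypothetical driver function; both are assumptions of this sketch.

```python
def declare(type_keyword, identifiers):
    """Process a declaration such as 'real p, q' using D -> T L, L -> L1 , id | id."""
    symbol_table = {}

    def addtype(entry, type_):
        # Stand-in for the addtype procedure: record the type of one identifier.
        symbol_table[entry] = type_

    # T -> int | real : T.type is synthesized from the keyword.
    t_type = {"int": "integer", "real": "real"}[type_keyword]

    def L(ids, l_in):
        # L -> L1 , id : L1.in = L.in ; addtype(id.entry, L.in)
        # L -> id      : addtype(id.entry, L.in)
        if len(ids) > 1:
            L(ids[:-1], l_in)  # the inherited value is passed down unchanged
        addtype(ids[-1], l_in)

    L(identifiers, t_type)  # D -> T L : L.in = T.type
    return symbol_table


print(declare("real", ["p", "q"]))
```

The parameter `l_in` plays the role of L.in: it is supplied by the caller (the parent or sibling in the parse tree), never computed from the children.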


Synthesized Attribute

The value of a synthesized attribute at a node is computed from the values of attributes at the children of that node in a parse tree. Consider the following grammar:

L → E n
E → E1 + T
E → T
T → T1 * F
T → F
F → (E)
F → digit

Let us consider a synthesized attribute val with each of the non-terminals E, T and F.

Token digit has a synthesized attribute lexval supplied by the lexical analyzer.

Production       Semantic Rule
L → E n          print (E.val)
E → E1 + T       E.val := E1.val + T.val
E → T            E.val := T.val
T → T1 * F       T.val := T1.val * F.val
T → F            T.val := F.val
F → (E)          F.val := E.val
F → digit        F.val := digit.lexval

The annotated parse tree for 5 + 3 * 4 is shown below:

L
  E.val = 17
    E.val = 5
      T.val = 5
        F.val = 5
          digit.lexval = 5
    +
    T.val = 12
      T.val = 3
        F.val = 3
          digit.lexval = 3
      *
      F.val = 4
        digit.lexval = 4
  n (return)

Dependency Graph

The interdependencies among the attributes of the various nodes of a parse tree can be depicted by a directed graph called a dependency graph.

• Synthesized attributes have edges pointing upwards.

• Inherited attributes have edges pointing downwards and/or sidewise.
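Bottom-up evaluation of the synthesized attribute val can be sketched with a small recursive function. The tuple representation of the tree is an assumption of this sketch, and the chain productions E → T and T → F are collapsed because they only copy the value.

```python
def val(node):
    """Compute the synthesized attribute val of a node from its children."""
    kind = node[0]
    if kind == "digit":  # F -> digit : F.val := digit.lexval
        return node[1]
    if kind == "+":      # E -> E1 + T : E.val := E1.val + T.val
        return val(node[1]) + val(node[2])
    if kind == "*":      # T -> T1 * F : T.val := T1.val * F.val
        return val(node[1]) * val(node[2])
    raise ValueError(f"unknown node kind: {kind}")


# Tree for the input 5 + 3 * 4.
tree = ("+", ("digit", 5), ("*", ("digit", 3), ("digit", 4)))
print(val(tree))  # L -> E n : print(E.val)
```

Every value is computed only from values returned by the children, which is exactly what makes val a synthesized attribute.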

Example 1: Consider an example which shows semantic rules for infix to postfix translation:

Production               Semantic Rules
expr → expr1 + term      expr.t := expr1.t || term.t || '+'
expr → expr1 - term      expr.t := expr1.t || term.t || '-'
expr → term              expr.t := term.t
term → 0                 term.t := '0'
...
term → 9                 term.t := '9'
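The rules can be traced with a short Python sketch that builds the attribute t from left to right. The function name `postfix` and the restriction to single-digit operands separated by + or - are assumptions of this example.

```python
def postfix(tokens):
    """Build expr.t for a token list such as ['9', '-', '5', '+', '2']."""
    # expr -> term : expr.t := term.t
    t = tokens[0]
    # expr -> expr1 op term : expr.t := expr1.t || term.t || op
    for i in range(1, len(tokens), 2):
        op, term = tokens[i], tokens[i + 1]
        t = t + term + op
    return t


print(postfix(list("9-5+2")))
```

Each loop iteration corresponds to one reduction by expr → expr1 + term or expr → expr1 - term, with the string concatenation standing for the || operator.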


Example 2: Write an SDD for the following grammar to determine number.val.

number → number digit | digit
digit → 0 | 1 | ... | 9

Production                  Semantic Rule
digit → 0                   digit.val := 0
digit → 1                   digit.val := 1
...
digit → 9                   digit.val := 9
number → digit              number.val := digit.val
number → number1 digit      number.val := number1.val * 10 + digit.val

Annotated tree for 131 is

number.val = 13 * 10 + 1 = 131
  number.val = 1 * 10 + 3 = 13
    number.val = 1
      digit.val = 1
        1
    digit.val = 3
      3
  digit.val = 1
    1

Examples of dependency graphs

Example 1: A.a := f (X.x, Y.y) is a semantic rule for A → XY. For each semantic rule of this form, the dependency graph has an edge to A.a from X.x and from Y.y.

A.a
  X.x
  Y.y

Example 2:

[Figure: dependency graph]

Example 3: real p, q;

D
  T.type = real
  L.in = real        addtype (q, real)
    L.in = real      addtype (p, real)
      id.entry = p
    id.entry = q

Evaluation order

A topological sort of a directed acyclic graph is an ordering m1, m2, ..., mk of the nodes of the graph such that edges go from nodes earlier in the ordering to later nodes.

mi → mj means mi appears before mj in the ordering.

If b := f (c1, c2, ..., ck), the dependent attributes c1, c2, ..., ck are available at the node before f is evaluated.
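An evaluation order can be obtained with a topological sort of the dependency graph. The sketch below uses Python's standard `graphlib` module; the node names for the declaration real p, q are labels chosen for this example.

```python
from graphlib import TopologicalSorter

# Each key depends on the attributes in its set (edges run from the set
# members to the key). The labels describe the declaration 'real p, q'.
dependencies = {
    "L.in": {"T.type"},       # D -> T L    : L.in = T.type
    "L1.in": {"L.in"},        # L -> L1, id : L1.in = L.in
    "addtype(q)": {"L.in"},   # uses L.in at the upper L node
    "addtype(p)": {"L1.in"},  # uses L.in at the lower L node
}

order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Any order returned this way lists T.type before L.in and L.in before L1.in, so every attribute is available before the rule that uses it runs. A cyclic graph would make `static_order` raise `graphlib.CycleError`, which matches the requirement that the dependency graph be acyclic.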

Abstract syntax tree

It is a condensed form of parse tree useful for representing language constructs.

Example

    if-then-else
    /    |    \
   B     S1    S2

Constructing Syntax Trees for Expressions

Each node in a syntax tree can be implemented as a record with several fields.

In the node for an operator, one field identifies the operator and the remaining fields contain pointers to the nodes for the operands.

1. mknode (op, left, right)

2. mkleaf (id, entry). Entry is a pointer to the symbol table.

3. mkleaf (num, val)

Production       Semantic Rules
E → E1 + T       E.nptr := mknode ('+', E1.nptr, T.nptr)
E → E1 - T       E.nptr := mknode ('-', E1.nptr, T.nptr)
E → T            E.nptr := T.nptr
T → (E)          T.nptr := E.nptr
T → id           T.nptr := mkleaf (id, id.entry)
T → num          T.nptr := mkleaf (num, num.val)

Parser stack for a production A → XYZ (state and val fields):

          State    Val
Top →     Z        Z.z
          Y        Y.y
          X        X.x

Construction of a syntax tree for a - 4 + c
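The construction for a - 4 + c can be sketched with the mknode and mkleaf functions. The `Node` record layout below is an assumption, and the symbol-table entries are represented simply by the identifier names.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Node:
    label: str                            # operator, 'id' or 'num'
    left: Optional["Node"] = None         # pointers to the operand nodes
    right: Optional["Node"] = None
    value: Union[str, int, None] = None   # symbol-table entry or number


def mknode(op, left, right):
    """Create an operator node with pointers to its two operands."""
    return Node(op, left, right)


def mkleaf(kind, value):
    """Create a leaf for an identifier (kind 'id') or a number (kind 'num')."""
    return Node(kind, value=value)


# Bottom-up construction, in the order the reductions would occur.
p1 = mkleaf("id", "a")     # T -> id
p2 = mkleaf("num", 4)      # T -> num
p3 = mknode("-", p1, p2)   # E -> E1 - T
p4 = mkleaf("id", "c")     # T -> id
p5 = mknode("+", p3, p4)   # E -> E1 + T
print(p5.label, p5.left.label, p5.right.value)
```

The root is the '+' node, its left child is the '-' subtree for a - 4, and its right child is the leaf for c, which reflects the left associativity of the grammar.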


Example: Consider the following grammar:

S → E$
E → E + E
E → E * E
E → (E)
E → I
I → I digit
I → digit

Implementation

Production       Semantic Action                     Stack Implementation
S → E$           {print (E.val)}                     print (val [top])
E → E + E        {E.val := E.val + E.val}            val [ntop] := val [top] + val [top-2]
E → E * E        {E.val := E.val * E.val}            val [ntop] := val [top] * val [top-2]
E → (E)          {E.val := E.val}                    val [ntop] := val [top-1]
E → I            {E.val := I.val}                    val [ntop] := val [top]
I → I digit      {I.val := I.val * 10 + digit}       val [ntop] := 10 * val [top] + digit
I → digit        {I.val := digit}                    val [ntop] := digit


Types of SDD's

Syntax directed definitions are used to specify syntax directed translations. There are two types of SDD:

1. S-Attributed Definitions

2. L-Attributed Definitions

S-attributed definitions

• Only synthesized attributes are used in the syntax directed definition.

L-attributed definitions

• Both inherited and synthesized attributes are used.
• An L-attributed grammar supports the evaluation of attributes associated with a production body; dependency-graph edges can go from left to right only.
• Each S-attributed grammar is also an L-attributed grammar.
• L-attributed grammars can be incorporated conveniently in top down parsing.
• These grammars interact well with LL (k) parsers (both table driven and recursive descent).

A syntax directed definition is L-attributed if each inherited attribute of Xj, 1 ≤ j ≤ n, on the right side of A → X1 X2 ... Xn depends only on:

1. The attributes of the symbols X1, X2, ..., Xj-1 to the left of Xj in the production.

2. The inherited attributes of A.

Every S-attributed definition is L-attributed, because the above two rules apply only to the inherited attributes.

Synthesized Attributes on the Parser Stack

• A translator for an S-attributed definition can often be implemented with an LR parser generator. Here the stack is implemented by a pair of arrays, state and val.
• Each state entry is a pointer to an LR (1) parsing table.
• Each val [i] holds the value of the attributes associated with the node. For A → XYZ, the val entries for X, Y and Z hold X.x, Y.y and Z.z, with Z on top of the stack.

Syntax Directed Translation Schemes

A translation scheme is a CFG in which attributes are associated with grammar symbols, and semantic actions, enclosed between braces { }, are inserted within the right sides of productions.

E → TR
R → op T {print (op.lexeme)} R1 | ε
T → num {print (num.val)}

Using this, the parse tree for 9 - 5 + 2 is

E
  T
    9 {print ('9')}
  R
    - T {print ('-')} R
      T
        5 {print ('5')}
      R
        + T {print ('+')} R
          T
            2 {print ('2')}
          R
            ε

If we have both inherited and synthesized attributes then we have to follow the following rules:

1. An inherited attribute for a symbol on the right side of a production must be computed in an action before that symbol.

2. An action must not refer to a synthesized attribute of a symbol on the right side of the action.

3. A synthesized attribute for the non-terminal on the left can only be computed after all attributes it references have been computed.

Note: In the implementation of L-attributed definitions during predictive parsing, instead of syntax directed translations, we will work with translation schemes.

Eliminating left recursion from translation scheme

Consider the following grammar, which has left recursion:

E → E + T {print ('+');}

When transforming the grammar, treat the actions as if they were terminal symbols. After eliminating left recursion from the above grammar:

E → TR
R → + T {print ('+');} R
R → ε

Bottom-up Evaluation of Inherited Attributes

• Using a bottom up translation scheme, we can implement any L-attributed definition based on an LL (1) grammar.

• We can also implement some L-attributed definitions based on LR (1) grammars using a bottom up translation scheme.

• The semantic actions are evaluated during the reductions.

• During the bottom up evaluation of S-attributed definitions, we have a parallel stack to hold synthesized attributes.

The question is where to hold inherited attributes. We convert our grammar to an equivalent grammar to guarantee the following:

• All embedding semantic actions in our translation scheme will be moved to the end of the production rules.

• All inherited attributes will be copied into the synthesized attributes (maybe of new non-terminals).

Thus we will evaluate all semantic actions during reductions, and we find a place to store an inherited attribute. The steps are:

1. Remove an embedding semantic action Si and put a new non-terminal Mi instead of that semantic action.

2. Put Si at the end of a new production rule Mi → ε.

3. Semantic action Si will be evaluated when this new production rule is reduced.

Evaluation order of semantic rules is not changed, i.e., if

A → {S1} X1 {S2} X2 ... {Sn} Xn

After removing embedding semantic actions:

A → M1 X1 M2 X2 ... Mn Xn
M1 → ε {S1}
M2 → ε {S2}
...
Mn → ε {Sn}

For example,

E → TR
R → +T {print ('+')} R1
R → ε
T → id {print (id.name)}

⇓ remove embedding semantic actions

E → TR
R → +TMR1
R → ε
T → id {print (id.name)}
M → ε {print ('+')}
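The scheme with the embedded action can be run directly by a recursive-descent translator, where each action executes at the point where it appears in the production body. This is a sketch under the assumption that the input is a list of identifier and '+' tokens; the function names are hypothetical.

```python
def translate(tokens):
    """Translate e.g. ['a', '+', 'b'] to postfix using E -> T R."""
    out = []
    pos = 0

    def T():
        nonlocal pos
        out.append(tokens[pos])  # T -> id {print(id.name)}
        pos += 1

    def R():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "+":
            pos += 1             # match '+'
            T()
            out.append("+")      # embedded action {print('+')}, before the last R
            R()
        # R -> epsilon : nothing to do

    T()                          # E -> T R
    R()
    return " ".join(out)


print(translate(["a", "+", "b", "+", "c"]))
```

The action runs after T and before the recursive call to R, which is the position the marker non-terminal M occupies in the transformed grammar.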

Translation with inherited attributes

Let us assume that every non-terminal A has an inherited attribute A.i and every symbol X has a synthesized attribute X.s in our grammar.

For every production rule A → X1 X2 ... Xn, introduce new marker non-terminals M1, M2, ..., Mn and replace this production rule with

A → M1 X1 M2 X2 ... Mn Xn

The synthesized attribute of Xi will not be changed.

The inherited attribute of Xi will be copied into the synthesized attribute of Mi by the new semantic action added at the end of the new production rule Mi → ε.

Now, the inherited attribute of Xi can be found in the synthesized attribute of Mi.

A → {B.i = f1 (..)} B {C.i = f2 (..)} C {A.s = f3 (..)}

⇓

A → {M1.i = f1 (..)} M1 {B.i = M1.s} B {M2.i = f2 (..)} M2 {C.i = M2.s} C {A.s = f3 (..)}
M1 → ε {M1.s = M1.i}
M2 → ε {M2.s = M2.i}
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r r>»- 1 

t ( * Ct ' C lfor<l ,ll ' stionS 1 W ,J: Sclcct ‘he correct m, 

aivcncho.ee*. ,,l,cr »a- 

"'"^anno.n.edtrc' fo, input ((«) + (% f 

l-^n below «s__ l " c,l 'lcs 

Semantic Rule 


£^' T 

E- T 

id 

num 


$ $ = mk no^T^T^- 
$ $ = mknode (*-. $1, $ 3) 
$$ = $ 1 ; 

$ $ = $ 2 ; 

$ S = mkleaf (id, $i) 

$ $ = mkleaf (num, $i) 


(A) 


E 

I 

7^ 

( E 

E + 


T 

,/N 


E 

I 

T 

I 

id = a 


E 

1 

T 

I 

id = b 


) 



,C) A 

E + T 


(D) None of these 


• ptcr 2 Syntax Directed Translation | 3-833 
*’• Which of u„. r „ . 

tio„ r „| L , s K Allowing productions with transla- 
dccinvi| CS C ° nvcr,s binary number representation into 


^°ductlon 

Semantic Rule 

B-> 0 

B.trans = 0 

8->i 

B. trans = 1 

b->b 0 

B r trans = B 2 .trans’2 

b->b, 

B r trans = B 2 .trans *2+1 


Production 

Semantic Rule 

B->o 

B.trans = 0 

B-,B 0 

B r trans = B z .trans*4 


Production 

Semantic Rule 

B —> 1 

B.trans = 1 

B-> B, 

B, .trans = B 2 .trans*2 


(D) None of these 
4. The grammar given below is 


Production 

Semantic Rule 

A —> LM 

L.i := l(A. i) 


M.i := m(L.s) 


A.s := f(M.s) 

a->qr 

R i := r(A.i) 


T id = b 
id = a 

2. Let synthesized attribute val give the value of the binary 
number generated by S in the following grammar. 
S-^LL 
S^L 
1->LB 
L —> s 
0 
1 

lnput 101.101, S, val = 5.625 

^synthesized attributes to determine 5.val 

. ' c b °f the following are true? 

lA) 5 L v L 2 {S.val = L,.val + L v val/ ( 2 **L : .bits) 

(R II {S.val = I.val; S.bits = L.bits} 

’ 1 I, B {L.val = L,.val*2 + S.val; 

I-bits = L,.bits + 1} 

\B (L.val = S.val; L.bits = 1} 

C) B -^0 (S.val = 0} 
f II {S.val = 1 } 

D) A » of these 


Q.i := q(R.s) 
A.s := f(Q.s) 


(A) A L-attributed grammar 

(B) Non-L-attributed grammar 

(C) Data insufficient 

(D) None of these 

5. Consider the following syntax directed translation:
S → aS {m := m + 3; print (m);}
| bS {m := m * 2; print (m);}
| ε {m := 0;}

A shift reduce parser evaluates the semantic action of a production whenever the production is reduced.

If the string is aababb, then which of the following is printed?

(A) 0 0 3 6 9 12 (B) 0 0 0 3 6 9 12
(C) 0 0 0 3 6 9 12 15 (D) 0 0 3 9 6 12

6. Which attribute can be evaluated by a shift reduce parser that executes semantic actions only at reduce moves but never at shift moves?

(A) Synthesized attribute (B) Inherited attribute
(C) Both (A) and (B) (D) None of these
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3 83 4 | Compiler Design 

7. Consider the following nn.u-tntul I ^ + z . num 
A 

. / I % Cnum-num 

U * , 

B nu^ "* n ufT1 | 

num num 

Which of ihc following is true for llic give" jinnolated 

(A) There is a specific order for oval..alien of attribulc 

(B) A n ny’cSu“.on C o'd=r flra, compares an a,iribfiio 
'A' after all other attributes which ‘/l depends on, 
is acceptable. 

(C) Both (A) and (B) 

(D) None of these. 

Common data for questions 8 and 9: Consider the following grammar and syntax directed translation.

Production       Semantic Rule
E → E + T        E1.val = E2.val + T.val
E → T            E.val = T.val
T → T * P        T1.val = T2.val * P.val * P.num
T → P            T.val = P.val * P.num
P → (E)          P.val = E.val
P → 0            P.num = 1
                 P.val = 2
P → 1            P.num = 2
                 P.val = 1

8. What is E.val for string 1 * 0?

(A) 8
(B) 6
(C) 4
(D) 12

9. What is the E.val for string 0 * 0 + 1?

(A) 8
(B) 6
(C) 4
(D) 12


10. Consider the following syntax directed definition:

Production       Semantic Rule
S → begin        S.x = 0
                 S.y = 0
S → S1 I         S.x = S1.x + I.dx
                 S.y = S1.y + I.dy
I → east         I.dx = 1
                 I.dy = 0
I → north        I.dx = 0
                 I.dy = 1
I → west         I.dx = -1
                 I.dy = 0
I → south        I.dx = 0
                 I.dy = -1

If input = begin east south west north, after evaluating this sequence what will be the value of S.x and S.y?
(A) (1,0) 

®> <o;j: 

? ‘"Put 







(C) 

11. What will be the values .vjc, v v r 

south west’? 1 

(A) (-2.-1) 

(B) (2, I) 

(C) (2,2) 

(D) (3, 1) 

12. Consider the following grammar; 

S -» £ S.val = £.val 
£.num = 1 

£->£*£ £ 1 .val = 2*£ r v a i + 2 , 

£ 2 .num = £ r num + i 
7’.num = £ | .num+ l 

£ —> P £.val = P.val 

Pnum = £.num + 1 
T->T+P P,.v al = f r val + f. va i 

T^-num = r r num + 1 
P num = jT.num + l 
P—> P P.val = P.val 

P .num = P.num + 1 
P ~* (£) P.val = £.val 

. | £.num = P.num 1 

[P.val = / |P.numj 

Which attributes are inherited and which are synthesized in the above grammar?

(A) Num attribute is inherited attribute. Val attribute is synthesized attribute.

(B) Num is synthesized attribute. Val is inherited attribute.

(C) Num and val are inherited attributes.

(D) Num and val are synthesized attributes.

13. Consider the grammar with the following translation rules and E as the start symbol.

E → E1 @ T {E.value = E1.value * T.value}
| T {E.value = T.value}

T → T1 and F {T.value = T1.value + F.value}
| F {T.value = F.value}

F → num {F.value = num.value}

Compute E.value for the root of the parse tree for the expression: 2 @ 3 and 5 @ 6 and 4

(A) 200 (B) 180
(C) 160 (D) 40
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problems 2 

pf ,C fnr Questions I to 10: select the 

C.ii“'P ivcnc 

i'"’ the following Tree: 


correct altcrna- 


Production 

Meaning 

£->£,-* T 

E.t= E r VT.t 

E-> - T 

E.t= E y t + T.t 

E—> T 

E.t = T.t 

/-> 0 

P 

II 

£ 

l-> 5 

T.t = ‘5' 

f-+ 2 

T.t = '2' 

l—> 4 

T.t = ‘ 4’ 


E + T 

/l\ 

E - T 


After evaluation of the tree the value at the root will be: 

(A) 28 (B) 32 

(C) 14 7 

2. The value of an inherited attribute is computed from the values of attributes at the:

(A) Sibling nodes (B) Parent of the node
(C) Children node (D) Both (A) and (B)

3. Consider an action translating expression:

expr → expr + term {print ('+')}
expr → expr - term {print ('-')}
expr → term
term → 1 {print ('1')}
term → 2 {print ('2')}
term → 3 {print ('3')}

Which of the following is true regarding the above translation expression?

(A) Action translating expression represents infix notation.

(B) Action translating expression represents prefix notation.

(C) Action translating expression represents postfix notation.

(D) None of these

4. In the given problem, what will be the result after evaluating 9 - 5 + 2?

(A) + - 9 5 2 (B) 9 - 5 + 2
(C) 9 5 - 2 + (D) None of these

5. In a syntax directed translation, if the value of an attribute at a node is a function of the values of attributes of its children, then it is called:

(A) Synthesized attribute (B) Inherited attribute
(C) Canonical attributes (D) None of these

6. Inherited attribute is a natural choice in:

(A) Keeping track of variable declaration

(B) Checking for the correct use of L-values and R-values.

(C) Both (A) and (B)

(D) None of these

7. Syntax directed translation scheme is desirable because

(A) It is based on the syntax

(B) Its description is independent of any implementation.

(C) It is easy to modify

(D) All of these

8. A context free grammar in which program fragments, called semantic actions, are embedded within the right side of the production is called,

(A) Syntax directed translation

(B) Translation schema

(C) Annotated parse tree

(D) None of these

9. A syntax directed definition specifies translation of a construct in terms of:

(A) Memory associated with its syntactic component

(B) Execution time associated with its syntactic component

(C) Attributes associated with its syntactic component

(D) None of these

10. If an error is detected within a statement, the type assigned to the statement is:

(A) Error type (B) Type expression
(C) Type error (D) Type constructor


Consider 

n data for questions 1 seniant ic rules for 

wring expression grammar. ' gram mar pro¬ 
fit evaluation arc stated next to eac i2005) 


„ r*n:nVKf?r 


£ val- number.val 


questions 


- ab0V e grammar and the semantic rules are fed 

1 . (A) The atm ^ jf . af) LALR (l ) parser gener _ 

t0 for parsing and evaluating arithmetic expres- 
at ° r which one of the following is true about the 
Sl ° nS ‘ fyacc for the given grammar? 
action o ursion and eliminates recursion 
(A) cduc e-reduce conflict, an ^J|jg^ 


i mi It detects red 

ciJl.vrtl (**) i., 
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' 


After reading this chapter, you will be able to understand:

• Code optimization basics
• Principle sources of optimization
• Loop invariant code motion
• Strength reduction on induction variables
• Loops in flow graphs
• Pre-header
• Global data flow analysis
• Definition and usage of variables
• Use-definition (u-d) chaining
• Data flow equations

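Compilers that apply code-improving transformations are called optimizing compilers.

Properties of the transformations of an optimizing compiler

1. A transformation must preserve the meaning of programs.

2. A transformation must speed up programs by a measurable amount.

3. A transformation must be worth the effort.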


Places for improvements 

1. Source Code: 

User can - profile a program 

- change an algorithm 

- transform loops 

2. Intermediate code can be improved by improving 

- Loops 

- Procedure calls 

- Address calculations 

3. Target code can be improved by 

- Using registers 

- Selecting instructions 

- Peephole transformations 


Optimizing compiler organization 

This applies
• Control flow analysis

• Data flow analysis 

• Transformations 


Issues in design of code optimization The issues in the design of code optimization are
1. Target machine characteristics

2. Target CPU architecture

3. Functional units

Target machine Optimization is done according to the target machine characteristics.

CPU architecture The issues to be considered with respect to CPU architecture are
• Number of CPU registers
• RISC instruction set
• CISC instruction set
• Pipelining

Functional units Optimization is also based on the functional units of the machine, so that instructions can be scheduled on them.


Principal Sources of Optimization

Code improving transformations can be local or global. A transformation is local if it can be performed by looking only at the statements within a basic block. Otherwise it is called a global transformation.

Function preserving Transformations

These transformations improve the program without changing the function it computes. Some of these transformations are



1. Common sub expression elimination
2. Copy propagation
3. Dead-code elimination
4. Loop optimization
• Code motion
• Induction variable elimination
• Reduction in strength


Common sub expression elimination The process of identifying common sub expressions and eliminating their computation multiple times is known as common sub expression elimination.

Example: Consider the following program segment: 

int sum_n, sum_n2, sum_n3;
int sum (int n)
{
    sum_n = ((n) * (n + 1)) / 2;
    sum_n2 = ((n) * (n + 1) * (2*n + 1)) / 6;
    sum_n3 = (((n) * (n + 1)) / 2) * (((n) * (n + 1)) / 2);
}

Three Address code for the above input is

(0) proc-begin sum
(1) t0 := n + 1
(2) t1 := n * t0
(3) t2 := t1 / 2
(4) sum_n := t2
(5) t3 := n + 1
(6) t4 := n * t3
(7) t5 := 2 * n
(8) t6 := t5 + 1
(9) t7 := t4 * t6
(10) t8 := t7 / 6
(11) sum_n2 := t8
(12) t9 := n + 1
(13) t10 := n * t9
(14) t11 := t10 / 2
(15) t12 := n + 1
(16) t13 := n * t12
(17) t14 := t13 / 2
(18) t15 := t11 * t14
(19) sum_n3 := t15
(20) label L0
(21) proc-end sum


The sub expression (n*(n + 1)) is common: the quadruples that compute it are essentially the same, and it is the same value that is computed each time.

This common sub expression is computed four times in the above example.

When the code has common sub expressions, compute them once and then reuse the computed values further on.

Optimized intermediate code will be

(0) proc-begin sum
(1) t0 := n + 1
(2) t1 := n * t0
(3) sum_n := t1 / 2
(4) t5 := 2 * n
(5) t6 := t5 + 1
(6) t7 := t1 * t6
(7) sum_n2 := t7 / 6
(8) sum_n3 := sum_n * sum_n
(9) proc-end sum
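The effect of this transformation can also be seen at the source level. The sketch below is only an illustration: the struct `Sums` and the function names `sums_unoptimized` and `sums_optimized` are hypothetical and do not appear in the text. It assumes n is small enough that the products do not overflow an int.

```c
/* Results of the three sums; this struct is a hypothetical convenience. */
typedef struct {
    int sum_n, sum_n2, sum_n3;
} Sums;

/* Direct translation: n * (n + 1) is evaluated four times. */
Sums sums_unoptimized(int n)
{
    Sums s;
    s.sum_n  = (n * (n + 1)) / 2;
    s.sum_n2 = (n * (n + 1) * (2 * n + 1)) / 6;
    s.sum_n3 = ((n * (n + 1)) / 2) * ((n * (n + 1)) / 2);
    return s;
}

/* After common sub expression elimination: t1 is computed once and reused. */
Sums sums_optimized(int n)
{
    Sums s;
    int t1 = n * (n + 1);
    s.sum_n  = t1 / 2;
    s.sum_n2 = (t1 * (2 * n + 1)) / 6;
    s.sum_n3 = s.sum_n * s.sum_n;
    return s;
}
```

Both functions return the same three values for the same n; only the number of multiplications and additions differs.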

Constant folding The constant expressions in the input 
source are evaluated and replaced by the equivalent values 
at the time of compilation. 

For example 10*3 and 6 + 101 are constant expressions and they are replaced by 30 and 107 respectively.
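A constant folder can be sketched as a bottom-up walk over an expression tree. The `Expr` type and the function `fold_constants` below are hypothetical names chosen for this illustration, and only '+', '-' and '*' are handled.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Expr {
    char op;                 /* 'k' marks a constant leaf, 'v' a variable leaf;
                                '+', '-' and '*' mark binary operations */
    int value;               /* meaningful only when op == 'k' */
    struct Expr *lhs, *rhs;  /* NULL for leaves */
} Expr;

/* Folds constant sub expressions bottom-up. Returns true when e itself is
   (or has become) a constant. */
bool fold_constants(Expr *e)
{
    if (e == NULL) return false;
    if (e->op == 'k') return true;
    if (e->op == 'v') return false;

    bool left_const = fold_constants(e->lhs);
    bool right_const = fold_constants(e->rhs);
    if (!left_const || !right_const) return false;

    int a = e->lhs->value, b = e->rhs->value;
    switch (e->op) {
    case '+': e->value = a + b; break;
    case '-': e->value = a - b; break;
    case '*': e->value = a * b; break;
    default:  return false;      /* unknown operator: leave untouched */
    }
    e->op = 'k';                 /* the node is now a constant leaf */
    e->lhs = e->rhs = NULL;
    return true;
}
```

Applied to a node for 10*3, the node becomes the constant 30. A node with a variable operand is left in place, although its constant children are still folded.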

Example: Consider the following C code:

int arr1 [10];
int main ( )
{
    arr1 [0] = 3;
    arr1 [1] = 4;
}

Unoptimized three address code equivalent to the above C code is

(0) proc-begin main
(1) t0 := 0*4
(2) t1 := &arr1
(3) t1 [t0] := 3
(4) t2 := 1*4
(5) t3 := &arr1
(6) t3 [t2] := 4
(7) label L0
(8) proc-end main

In the above code, 0*4 is a constant expression whose value is 0, and 1*4 is a constant expression whose value is 4.

After applying constant folding, the optimized code will be

(0) proc-begin main
(1) t0 := 0
(2) t1 := &arr1
(3) t1 [t0] := 3
(4) t2 := 4
(5) t3 := &arr1
(6) t3 [t2] := 4
(7) label L0
(8) proc-end main

Copy propagation In copy propagation, if there is a copy statement x := y, the later uses of x are replaced with y.

Example: In the previous example, there are two copy statements.

After applying copy propagation, the optimized code will be

(0) proc-begin main
(1) t0 := 0
(2) t1 := &arr1
(3) t1 [0] := 3
(4) t2 := 4
(5) t3 := &arr1
(6) t3 [4] := 4
(7) label L0
(8) proc-end main

In the three address code shown above, quadruples (1) 
and (4) are no longer used in any of the following statements. 

(1) and (4) can be eliminated. 

Three address code after dead store elimination

(0) proc-begin main
(1) t1 := &arr1
(2) t1 [0] := 3
(3) t3 := &arr1
(4) t3 [4] := 4
(5) label L0
(6) proc-end main

In the above example, we are propagating constant values. It is also known as constant propagation.

Variable propagation Propagating another variable instead of the existing one is known as variable propagation.

Example:

int func(int a, int b, int c)
{
    int d, e, f;
    d = a;
    if (a > 10)
    {
        e = d + b;
    }
    else
    {
        e = d + c;
    }
    f = d * e;
    return (f);
}



Three address code (unoptimized):

(0) proc-begin func
(1) d := a
(2) if a > 10 goto L0
(3) goto L1
(4) label: L0
(5) e := d + b
(6) goto L2
(7) label: L1
(8) e := d + c
(9) label: L2
(10) f := d*e
(11) return f
(12) goto L3
(13) label: L3
(14) proc-end func

Three address code after variable (copy) propagation:

(0) proc-begin func
(1) d := a
(2) if a > 10 goto L0
(3) goto L1
(4) label: L0
(5) e := a + b
(6) goto L2
(7) label: L1
(8) e := a + c
(9) label: L2
(10) f := a*e
(11) return f
(12) goto L3
(13) label: L3
(14) proc-end func

After dead store elimination:

In the above code, (1) d := a is no longer used. Eliminate the dead store d := a.

(0) proc-begin func
(1) if a > 10 goto L0
(2) goto L1
(3) label: L0
(4) e := a + b
(5) goto L2
(6) label: L1
(7) e := a + c
(8) label: L2
(9) f := a*e
(10) return f
(11) goto L3
(12) label: L3
(13) proc-end func


Dead code elimination Eliminating the code that never gets executed by the program is known as dead code elimination. It reduces the memory required by the program.

Example: Consider the following three address code:

(0) proc-begin func
(1) debug := 0
(2) if debug == 1 goto L0
(3) goto L1
(4) label: L0
(5) param c
(6) param b
(7) param a
(8) param lc1
(9) call printf 16
(10) retrieve t0
(11) label: L1
(12) t1 := a + b
(13) t2 := t1 + c
(14) v1 := t2
(15) return v1
(16) goto L2
(17) label: L2
(18) proc-end func

In copy propagation, debug is replaced with 0 wherever debug is used after that assignment.

Statement (2) will be changed to
if 0 == 1 goto L0

0 == 1 always evaluates to false.

The control cannot flow to label: L0.

This makes the statements (4) through (10) dead code. (2) can also be removed as part of dead code elimination. (1) cannot be eliminated, because debug is a global variable. The optimized code after elimination of dead code is shown below.

(0) proc-begin func
(1) debug := 0


Algebraic transformations We can use algebraic identities to optimize the code further. For example:

Additive identity: a + 0 = a
Multiplicative identity: a*1 = a
Multiplication with 0: a*0 = 0

Example:

struct mystruct
{
    int a [20];
    int b;
} xyz;

int func (int i)
{
    xyz.a[i] = 34;
}

The unoptimized three address code:

(0) proc-begin func
(1) t0 := &xyz
(2) t1 := 0
(3) t2 := i*4
(4) t3 := t2 + t1
(5) t0 [t3] := 34
(6) label: L0
(7) proc-end func

Optimized code after copy propagation and dead code elimination is shown below:

The statement t1 := 0 is eliminated.

(0) proc-begin func
(1) t0 := &xyz
(2) t2 := i*4
(3) t3 := t2 + 0
(4) t0 [t3] := 34
(5) label: L0
(6) proc-end func

After applying additive identity:

(0) proc-begin func
(1) t0 := &xyz
(2) t2 := i*4
(3) t3 := t2
(4) t0 [t3] := 34
(5) label: L0
(6) proc-end func

After copy propagation and dead store elimination:

(0) proc-begin func
(1) t0 := &xyz
(2) t2 := i*4
(3) t0 [t2] := 34
(4) label: L0
(5) proc-end func

Strength reduction transformation This transformation replaces expensive operators by equivalent cheaper ones on the target machine. For example, a multiplication such as x*2 can be replaced by a cheaper equivalent such as x + x.

Loop optimization

Loop invariant code motion
The statements within a loop that compute a value which does not vary throughout the life of the loop are called loop invariant statements.

Consider the following program fragment. 

int a [100];

int func(int x, int y)
{
    int i;
    int n1, n2;
    i = 0;
    n1 = x*y;
    n2 = x - y;
    while (a[i] > (n1*n2)) {
        i = i + 1; }
    return (i);
}

The Three Address code for above program is

(0) proc-begin func
(1) i := 0
(2) n1 := x*y
(3) n2 := x-y
(4) label: L0
(5) t2 := i*4
(6) t3 := &arr
(7) t4 := t3 [t2]
(8) t5 := n1*n2
(9) if t4 > t5 goto L1
(10) goto L2
(11) label: L1
(12) i := i+1
(13) goto L0
(14) label: L2
(15) return i
(16) goto L3
(17) label: L3
(18) proc-end func

In the above code, statements (6) and (8) are invariant.

After loop invariant code motion transformation, the code will be

(0) proc-begin func
(1) i := 0
(2) n1 := x*y
(3) n2 := x-y
(4) t3 := &arr
(5) t5 := n1*n2
(6) label: L0
(7) t2 := i*4
(8) t4 := t3 [t2]
(9) if t4 > t5 goto L1
(10) goto L2
(11) label: L1
(12) i := i+1
(13) goto L0
(14) label: L2
(15) return i
(16) goto L3
(17) label: L3
(18) proc-end func
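The same transformation can be written at the source level. In this sketch the function names `scan_unoptimized` and `scan_hoisted` are hypothetical, and the array is passed as a parameter instead of being global. The caller is assumed to supply an array that contains an element not greater than n1*n2, so that the loop terminates.

```c
/* Before: n1 * n2 is recomputed on every test of the loop condition. */
int scan_unoptimized(const int *a, int x, int y)
{
    int i = 0;
    int n1 = x * y;
    int n2 = x - y;
    while (a[i] > (n1 * n2))
        i = i + 1;
    return i;
}

/* After: the invariant product is computed once in front of the loop. */
int scan_hoisted(const int *a, int x, int y)
{
    int i = 0;
    int n1 = x * y;
    int n2 = x - y;
    int t5 = n1 * n2;        /* hoisted loop invariant computation */
    while (a[i] > t5)
        i = i + 1;
    return i;
}
```

Both versions return the same index. The hoisted version performs one multiplication for the comparison value instead of one per iteration.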


Strength reduction on induction variables

Induction variable: A variable that changes by a constant quantity on each of the iterations of a loop is an induction variable.

Example: Consider the following code fragment: 

int i;
int a [20];
int func( )
{
    while (i < 20)
    {
        a[i] = 10;
        i = i + 1;
    }
}


The three-address code will be

(0) proc-begin func
(1) label: L0
(2) if i < 20 goto L1
(3) goto L2
(4) label: L1
(5) t0 := i*4
(6) t1 := &a
(7) t1 [t0] := 10
(8) i := i+1
(9) goto L0
(10) label: L2
(11) label: L3
(12) proc-end func

After reduction of strength the code will be as follows. The computation i*4 is moved out of the loop.

(0) proc-begin func
(0a) t0 := i*4
(1) label: L0
(2) if i < 20 goto L1
(3) goto L2
(4) label: L1
(6) t1 := &a
(7) t1 [t0] := 10
(8) i := i+1
(8a) t0 := t0 + 4
(9) goto L0
(10) label: L2
(11) label: L3
(12) proc-end func
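The following sketch mirrors the two versions of the loop in C. The names `fill_naive` and `fill_reduced` are hypothetical, and sizeof(int) is used for the element size where the three-address code uses the constant 4. The byte offset t0 is made explicit so that the multiplication and its replacement by an addition are visible.

```c
#include <stddef.h>

/* Before: the byte offset is recomputed with a multiplication per iteration. */
void fill_naive(int *a, int n)
{
    for (int i = 0; i < n; i++) {
        size_t t0 = (size_t)i * sizeof(int);
        *(int *)((char *)a + t0) = 10;
    }
}

/* After: t0 is itself an induction variable, advanced by a constant step. */
void fill_reduced(int *a, int n)
{
    size_t t0 = 0;                     /* i * sizeof(int) for i == 0 */
    for (int i = 0; i < n; i++) {
        *(int *)((char *)a + t0) = 10;
        t0 += sizeof(int);             /* replaces the multiplication */
    }
}
```

Both functions store 10 into the first n elements of the array.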


Loops in Flow Graphs

Loops in the code are detected during the data flow analysis by using the concept called 'dominators' in the flow graph.

Dominators

A node d of a flow graph dominates node n, if every path from the initial node to n goes through d.

It is represented as d dom n.

Notes: 

1. Each and every node dominates itself. 

2. Entry of the loop dominates all nodes in the loop. 

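Example: Consider the following code fragment.

int func(int a)
{
    int x, y;
    x = a;
    y = a;
    while (a < 100)
    {
        y = y*x;
        x = x+1;
    }
    return (y);
}

The Three Address code after local optimization will be

(0) proc-begin func
(1) x := a
(2) y := a
(3) label: L0
(4) if a < 100 goto L1
(5) goto L2
(6) label: L1
(7) t0 := y*x
(8) y := t0
(9) t1 := x + 1
(10) x := t1
(11) goto L0
(12) label: L2
(13) return y
(14) goto L3
(15) label: L3
(16) proc-end func

The Flow Graph for above code will be:

[Figure: flow graph of func with basic blocks starting at B0]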


To reach B1, control must pass through B0.

B1 dominates B1. Also B0 dominates B1.

Dominators [B1] = {B0, B1} (or) dominators [1] = {0, 1}

The dominators for each of the nodes in the flow graph are

dominators [0] = {0}
dominators [1] = {0, 1}
dominators [2] = {0, 1, 2}
dominators [3] = {0, 1, 3}
dominators [4] = {0, 1, 2, 4}
dominators [5] = {0, 1, 2, 4, 5}

Edge 

An edge in a flow graph represents a possible flow of control. In the flow graph, the edge from B0 to B1 is represented as 0 → 1.

Head and tail: In the edge a → b, the node b is called the head and the node a is called the tail.

Back edges: A back edge is an edge in which dominators [tail] contains the head.

Example: Consider below flow graph:



The dominators of each node are 
dominators [0] = {0} 
dominators [1] = {0, 1} 
dominators [2] = {0, 2} 
dominators [3] = {0, 1,3} 
dominators [4] = {0, 2,4} 
dominators [5] = {0, 1, 3, 5} 
dominators [6] = {0, 2, 4, 6} 
dominators [7] = {0, 7} 


Header of the loop The entry of the loop is called the header of the loop.

Loop exit block A block through which a loop L can be exited is called a loop exit block for the loop L. It is possible to have multiple exit blocks in a loop.

Dominator tree

A tree which represents dominator information in the form of a tree is a dominator tree. In this,

• The initial node is the root.
• Each node d dominates only its descendants in the tree.

Consider the flow graph




Edge     Head  Tail  Dominators [head]  Dominators [tail]  Remark
0 → 1    1     0     {0, 1}             {0}
0 → 2    2     0     {0, 2}             {0}
1 → 3    3     1     {0, 1, 3}          {0, 1}
3 → 1    1     3     {0, 1}             {0, 1, 3}          Back edge
3 → 5    5     3     {0, 1, 3, 5}       {0, 1, 3}
5 → 7    7     5     {0, 7}             {0, 1, 3, 5}
2 → 4    4     2     {0, 2, 4}          {0, 2}
6 → 2    2     6     {0, 2}             {0, 2, 4, 6}       Back edge
4 → 6    6     4     {0, 2, 4, 6}       {0, 2, 4}
6 → 7    7     6     {0, 7}             {0, 2, 4, 6}
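The dominator sets and back edges of this eight-node example can be computed mechanically. The sketch below uses the standard iterative formulation, in which the dominators of a node are the node itself plus the intersection of the dominators of its predecessors. Sets are stored as bit masks, and the names `compute_dominators` and `is_back_edge` are hypothetical. The edge list is the one from the edge table of the example.

```c
#define NODES 8
#define EDGES 10

/* Edges of the example flow graph, written as tail -> head. */
const int edge_tail[EDGES] = {0, 0, 1, 3, 3, 5, 2, 6, 4, 6};
const int edge_head[EDGES] = {1, 2, 3, 1, 5, 7, 4, 2, 6, 7};

/* dom[n] receives a bit mask: bit d is set when node d dominates node n. */
void compute_dominators(unsigned dom[NODES])
{
    const unsigned all = (1u << NODES) - 1u;
    dom[0] = 1u;                          /* the initial node dominates itself */
    for (int n = 1; n < NODES; n++)
        dom[n] = all;                     /* start from the full set */

    int changed = 1;
    while (changed) {
        changed = 0;
        for (int n = 1; n < NODES; n++) {
            unsigned meet = all;
            for (int e = 0; e < EDGES; e++)
                if (edge_head[e] == n)
                    meet &= dom[edge_tail[e]];   /* intersect predecessors */
            unsigned next = meet | (1u << n);
            if (next != dom[n]) {
                dom[n] = next;
                changed = 1;
            }
        }
    }
}

/* An edge is a back edge when dominators[tail] contains the head. */
int is_back_edge(const unsigned dom[NODES], int e)
{
    return (int)((dom[edge_tail[e]] >> edge_head[e]) & 1u);
}
```

For this graph the only edges reported as back edges are 3 → 1 and 6 → 2, which agrees with the table.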


The dominators of each node are 
dominators [1] = {1} 
dominators [2] = {1,2} 
dominators [3] = {1,3} 
dominators [4] = {1,3,4} 
dominators [5] = {1,3,4, 5} 
dominators [6] = {1, 3,4, 6} 
dominators [7] = {1,3,4, 7} 
dominators [8] = {1, 3,4,7,8} 
dominators [9] = {1, 3,4, 7, 8,9} 
dominators [10] = {1, 3,4,7, 8,10} 


The dominator tree will be: 



{B1, B3} form a loop; {B2, B4, B6} form another loop.

In each loop, the header dominates all nodes in the loop.





Reducible Flow Graphs

A flow graph G is reducible if and only if we can partition the edges into two disjoint groups:

(1) Forward edges

(2) Backward edges

with the following properties.

(i) The forward edges form an acyclic graph in which every node can be reached from the initial node of G.

(ii) The back edges consist only of edges whose heads dominate their tails.

Example: Consider previous flow graph

In the flow graph, there are five back edges
4 → 3, 7 → 4, 8 → 3, 9 → 1 and 10 → 7.

Remove all back edges.

The remaining edges must be the forward edges, and the remaining graph is acyclic.

It is reducible.


Global Dataflow Analysis 

Point: A point is a place of reference that can be found at 

1. Before the first statement in a basic block. 

2. After the last statement in a basic block. 

3. In between two adjacent statements within a basic block. 


Example 1:

B1:
a = 10
b = 20
c = a*b

Here, in B1 there are 4 points.

Example 2:

B1:
• P0-B1
proc-begin func
• P1-B1
v3 := v1 + v2
• P2-B1
if c > 100 goto L0
• P3-B1

There are 4 points in the basic block B1, given by P0-B1, P1-B1, P2-B1 and P3-B1.

Path: A path is a sequence of points in which the control 
can flow. 

A path from P1 to Pn is a sequence of points P1, P2, ..., Pn such that for each i between 1 and n-1, either

(a) Pi is the point immediately preceding a statement and Pi+1 is the point immediately following that statement in the same block.

(OR)

(b) Pi is the end of some block and Pi+1 is the beginning of a successor block.

Example: Path between the points P0-b0 and P6-b2:

The sequence of points P0-b0, P1-b0, P2-b0, P3-b0, P4-b2, P5-b2 and P6-b2.

Path between P3-b1 and P6-b2: There is no sequence of points.

Path between P0-b0 and P7-b3: There are two paths.

(1) Path 1 consists of the sequence of points, P 0 -b u , B, - /> u , 

P, - b v P> ~ K P > ~ h <>' P * - /J i and P i ~ K 

(2) Path 2 consists of the sequence of points P 0 - b u , B, - b n , 
p _ b 0 , P 3 - / V P a ~ P > ~ p b ~ I’r P i ~ and 
P 7 - b y 


Definition and Usage of Variables 
Definitions 

It is either an assignment to the variable or reading of a value for the variable.

Use

Use of identifier x means any occurrence of x as an operand.
Example: Consider the statement
x = y + z;

In this statement some value is assigned to x. It defines x and uses the values of y and z.


Global Data-Flow-Analysis

Data Flow Analysis (DFA) is a technique for gathering information about the possible set of values calculated at various points in a program.

• An example of a data-flow analysis is reaching definitions.

• A simple way to perform data-flow analysis of programs is to set up data flow equations for each node of the control flow graph.

Use definition (U-d) chaining

The use of a value is any point where that variable or constant is used in the right hand side of an assignment or in evaluating an expression.

The definition of a value occurs implicitly at the beginning of the whole program for a variable.

A point is defined either prior to or immediately after a statement.

Reaching definitions

A definition of a variable A reaches a point P if there is a path in the flow graph from that definition to P, such that no other definitions of A appear on the path.


Example:

[Figure: flow graph in which the definition A := 3 reaches a point p in a later block]

The definition A := 3 can reach point p.

To determine the definitions that can reach a given point in a program, first assign distinct numbers to each definition, since it is associated with a unique quadruple.

• For each simple variable A, make a list of all definitions of A anywhere in the program.

• Compute the following sets for each basic block B.

1. Gen [B], which is the set of definitions generated within block B that reach the end of the block.

2. Kill [B], which is the set of definitions outside of B that define identifiers that also have definitions within B.

3. IN [B], which are all definitions reaching the point just before B's first statement.

Once this is known, the definitions reaching any use of A within B are found by:

Let u be the statement being examined, which uses A.

1. If there are definitions of A within B before u, the last one is the only one reaching u.

2. If there is no definition of A within B prior to u, those reaching u are in IN [B].

Data Flow Equations

1. For all blocks B,

OUT [B] = (IN [B] - KILL [B]) ∪ GEN [B]

A definition d reaches the end of B if

(a) d ∈ IN [B] and is not killed by B.

(or)

(b) d is generated in B and is not subsequently redefined here.

2. IN [B] = ∪ OUT [P], ∀ P preceding B

A definition reaches the beginning of B iff it reaches the end of one of its predecessors.
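The two equations can be solved by iterating until no set changes. The sketch below is a minimal illustration with sets stored as bit masks, one bit per numbered definition. The type `BlockInfo`, the function `reaching_definitions` and the fixed block count are hypothetical choices for this example.

```c
#define BLOCKS 3

typedef struct {
    unsigned gen;   /* bit d set: definition d is generated in the block */
    unsigned kill;  /* bit d set: definition d is killed by the block */
    unsigned in;    /* definitions reaching the start of the block */
    unsigned out;   /* definitions reaching the end of the block */
} BlockInfo;

/* pred[b][p] is nonzero when block p is a predecessor of block b. */
void reaching_definitions(BlockInfo blocks[BLOCKS], int pred[BLOCKS][BLOCKS])
{
    for (int b = 0; b < BLOCKS; b++) {
        blocks[b].in = 0u;
        blocks[b].out = blocks[b].gen;
    }

    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 0; b < BLOCKS; b++) {
            unsigned in = 0u;
            for (int p = 0; p < BLOCKS; p++)
                if (pred[b][p])
                    in |= blocks[p].out;                 /* IN = union of OUT[P] */
            unsigned out = (in & ~blocks[b].kill) | blocks[b].gen;
            if (in != blocks[b].in || out != blocks[b].out)
                changed = 1;
            blocks[b].in = in;
            blocks[b].out = out;
        }
    }
}
```

As a usage example, take three blocks where block 0 contains definition d0 of A, block 1 contains definition d1 of A, and block 2 is reached from both. The result is IN [2] = {d0, d1}, because one path to block 2 avoids the redefinition in block 1.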


Computing U-d Chains

If a use of variable V is preceded in its block by a definition of V, this is the only one reaching it.

If no such definition precedes its use, all definitions of V in IN [B] are on its chain.

Uses of U-d Chains

1. If the only definition of V reaching this statement involves a constant, we can substitute that constant for V.

2. If no definition of V reaches this point, a warning can be given.

3. If a definition reaches nowhere, it can be eliminated. This is part of dead code elimination.


Practice Problems 1

Directions for questions 1 to 15: Select the correct alternative from the given choices.

1. Replacing the expression 2 * 3.14 by 6.28 is

(A) Constant folding

(B) Induction variable

(C) Strength reduction

(D) Code reduction

2. The expression (a*b)*c op ..., where 'op' is one of '+', '*' and '↑' (exponentiation), can be evaluated on a CPU with a single register without storing the value of (a*b) if

(A) 'op' is '+' or '*'

(B) 'op' is '↑' or '+'

(D) not possible to evaluate without storing

3. Machine independent code optimization can be applied to

(A) Source code

(B) Intermediate representation

6. In block B, if x or y is assigned there and s is not in B, then s: x = y is

(A) Generated (B) Killed

(C) Blocked (D) Dead

7. Given the following code
A = x + y;
B = x + y;

Then the corresponding optimized code is

C = x + y;
A = C;
B = C;

When will the optimized code pose a problem?

(A) When C is undefined.

(B) When memory is a consideration.

(C) C may not remain the same after some statements.

(D) Both (A) and (C).

8. Can the loop invariant X = A - B from the following code be moved out?

For i = 1 to 10
{
    A = 5 * C;
    X = A - B;
}

(A) No

(B) Yes

(C) X = A - B is not invariant

(D) Data insufficient


9. If every path from the initial node goes through a particular node, then that node is said to be a
(A) Leader (B) Dominator

Chapter 3

Intermediate Code Generation

LEARNING OBJECTIVES

After reading this chapter, you will be able to understand:

• Introduction
• Directed Acyclic Graphs (DAG)
• Three address code
• Symbol table operations
• Assignment statements
• Boolean expression
• Flow control of statements
• Procedure calls
• Code generation
• Next use information
• Run-time storage management
• DAG representations of basic blocks
• Peephole optimization

INTRODUCTION

In the analysis-synthesis model, the front end translates a source program into an intermediate representation (IR). From IR the back end generates target code.

There are different types of intermediate representations:

• High level IR, i.e., AST (Abstract Syntax Tree)
• Medium level IR, i.e., Three address code
• Low level IR, i.e., DAG (Directed Acyclic Graph)
• Postfix Notation (Reverse Polish Notation, RPN).

In the previous sections we have already discussed AST and RPN.

Benefits of Intermediate code generation: The benefits of ICG are

1. We can obtain an optimized code.

2. Compilers can be created for the different machines by attaching a new back end to the existing front end of each machine.

3. Compilers can be created for the different source languages.

Directed acyclic graphs for expression (DAG)

• A DAG identifies the common sub expressions of an expression.
• A node N in a DAG has more than one parent if N represents a common sub expression.
• A DAG gives the compiler important clues regarding the generation of efficient code to evaluate the expressions.

Example 1: DAG for a + a*(b - c) + (b - c)*d

P1 = makeleaf (id, a)
P2 = makeleaf (id, a) = P1
P3 = makeleaf (id, b)
P4 = makeleaf (id, c)
P5 = makenode (-, P3, P4)
P6 = makenode (*, P1, P5)
P7 = makenode (+, P1, P6)
P8 = makeleaf (id, b) = P3
P9 = makeleaf (id, c) = P4
P10 = makenode (-, P8, P9) = P5

[Figure: DAG for the expression, with its corresponding three address code]

THREE-ADDRESS CODE

In three address code, each statement usually contains 3 addresses, 2 for operands and 1 for the result.

Example: x = y OP z

• x, y, z are names, constants or compiler generated temporaries.

• OP stands for any operator: any arithmetic operator (or) logical operator.


Example: Consider the statement x = y * -z + y * -z

[Figure: syntax tree for the statement, with two unary-minus nodes over z]

The postfix notation for the syntax tree is:
x y z unaryminus * y z unaryminus * + =

• Three address code is a linearized representation of the syntax tree.

• The three address code can be formulated as a syntax directed translation. Add attributes whenever necessary.

Example: Consider below SDD with the following specifications:

E might have E.place and E.code.
E.place: the name that holds the value of E.

E.code: the sequence of intermediate code statements evaluating E.
Newtemp: returns a new temporary variable each time it is called.

Newlabel: returns a new label.

Then the SDD to produce three-address code for expressions is given below:

Production          Semantic Rules

S → id ASN E        S.code = E.code || gen (ASN, id.place, E.place)

E → E1 PLUS E2      E.place = newtemp ();
                    E.code = E1.code || E2.code || gen (PLUS, E.place, E1.place, E2.place)

E → E1 MUL E2       E.place = newtemp ();
                    E.code = E1.code || E2.code || gen (MUL, E.place, E1.place, E2.place)

E → UMINUS E1       E.place = newtemp ();
                    E.code = E1.code || gen (NEG, E.place, E1.place)

E → LP E1 RP        E.code = E1.code
                    E.place = E1.place

E → IDENT           E.place = id.place
                    E.code = empty.list ()
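The semantic rules above can be read as a recursive walk over an expression tree. The sketch below is a hedged illustration, not the author's implementation: the `Node` type, the buffer `tac_code` and the functions `gen_expr` and `gen_assign` are hypothetical, identifiers are single characters, and the code is appended to one global buffer instead of being concatenated through E.code attributes.

```c
#include <stdio.h>
#include <string.h>

/* kind: 'i' identifier, 'u' unary minus (operand in left), '+' or '*'. */
typedef struct Node {
    char kind;
    char name;                   /* used only when kind == 'i' */
    struct Node *left, *right;
} Node;

int temp_count = 0;              /* plays the role of newtemp() */
char tac_code[512] = "";         /* accumulated three address code */

static void emit(const char *line)
{
    strncat(tac_code, line, sizeof tac_code - strlen(tac_code) - 1);
}

/* Stores E.place in place[] and appends the code for e to tac_code. */
void gen_expr(const Node *e, char place[16])
{
    char l[16], r[16], line[64];
    if (e->kind == 'i') {                        /* E -> IDENT */
        snprintf(place, 16, "%c", e->name);
        return;
    }
    gen_expr(e->left, l);
    if (e->kind == 'u') {                        /* E -> UMINUS E1 */
        snprintf(place, 16, "t%d", ++temp_count);
        snprintf(line, sizeof line, "%s := uminus %s\n", place, l);
    } else {                                     /* E -> E1 PLUS E2, E1 MUL E2 */
        gen_expr(e->right, r);
        snprintf(place, 16, "t%d", ++temp_count);
        snprintf(line, sizeof line, "%s := %s %c %s\n", place, l, e->kind, r);
    }
    emit(line);
}

/* S -> id ASN E */
void gen_assign(char target, const Node *e)
{
    char place[16], line[64];
    gen_expr(e, place);
    snprintf(line, sizeof line, "%c := %s\n", target, place);
    emit(line);
}
```

For the statement x = y * -z + y * -z this produces six statements that use the temporaries t1 to t5 and end with x := t5.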


Types of Three Address Statements

Assignment

• Binary assignment: x := y op z. Store the result of y op z to x.

• Unary assignment: x := op y. Store the result of the unary operation on y to x.

Copy

• Simple copy: x := y. Store y to x.

• Indexed copy: x := y[i]. Store the contents of y[i] to x.

• x[i] := y. Store y to the (x + i)th address.

Address and pointer manipulation

• x := &y. Store address of y to x.

• x := *y. Store the contents of y to x.

• *x := y. Store y to location pointed by x.

Jump

• Unconditional jump: goto L, jumps to L.

• Conditional jump: if (x relop y) goto L

Implementation of Three Address Statements

Three address statements can be implemented as records with fields for the operator and the operands. There are 3 types of representations.

1. Quadruples

2. Triples

3. Indirect triples

Quadruples

A quadruple has four fields: op, arg1, arg2 and result.

• Unary operators do not use arg2.


Example 1: For the expression x = y * -z + y * -z, the quadruple representation is

      OP       Arg1   Arg2   Result
(0)   Uminus   z      -      t1
(1)   *        y      t1     t2
(2)   Uminus   z      -      t3
(3)   *        y      t3     t4
(4)   +        t2     t4     t5
(5)   =        t5     -      x



Example 2: Read (x)

Procedure call

• Param x

• Call p, n, x: Call procedure p with n parameters and store the result in x.

• Return x: Use x as result from procedure.

Declarations

• Global x, n1, n2: Declare a global variable named x at offset n1 having n2 bytes of space.

• Proc x, n1, n2: Declare a procedure x with n1 bytes of parameter space and n2 bytes of local variable space.

• Local x, m: Declare a local variable named x at offset m from the procedure frame.

• End: Declare the end of the current procedure.

Adaptations for object oriented code

• x = y field z: Lookup field named z within y, store address to x.

• Class x, n1, n2: Declare a class named x with n1 bytes of class variables and n2 bytes of method pointers.

• Field x, n: Declare a field named x at offset n in the class frame.

• New x: Create a new instance of class name x.

Example 3: WRITE (A*B, x+5)



Triples

Triples have three fields: OP, arg1, arg2.

• Temporaries are not used and instead references to instructions are made.

• Triples are also known as two address code.

• Triples take less space when compared with Quadruples.

• Optimization by moving code around is difficult.

• The DAG and triple representations of expressions are equivalent.

• For the expression a = y * -z + y * -z the Triple representation is

      Op       Arg1   Arg2
(0)   Uminus   z
(1)   *        y      (0)
(2)   Uminus   z
(3)   *        y      (2)
(4)   +        (1)    (3)
(5)   =        a      (4)

Array-references

Example: For A [i] := B, the quadruple representation is

      Op     Arg1   Arg2   Result
(0)   []=    A      i      T1
(1)   =      B             T1

• The contents of the fields are pointers to the symbol table entries for the names they represent.

• It is easier to optimize and to move code around.

The same can be represented by Triple representation also.

[]= is called L-value, specifies the address to an element.

      Op     Arg1   Arg2
(0)   []=    A      i
(1)   =      (0)    B

Example 2: A := B [i]

      Op     Arg1   Arg2
(0)   =[]    B      i
(1)   =      A      (0)

=[] is called r-value, specifies the value of an element.

Indirect Triples

• In indirect triples, pointers to triples will be there instead of triples.

• Optimization by moving code around is easy.

• Indirect triples take less space when compared with Quadruples.

• Both indirect triples and Quadruples are almost equally efficient.




Example: Indirect Triple representation of 3-address code

       Statement
(0)    (14)
(1)    (15)
(2)    (16)
(3)    (17)
(4)    (18)
(5)    (19)

       Op       Arg1   Arg2
(14)   Uminus   z
(15)   *        y      (14)
(16)   Uminus   z
(17)   *        y      (16)
(18)   +        (15)   (17)
(19)   =        x      (18)




Symbol Table Operations 

Treat symbol tables as objects.

• Mktable (previous)
  • Create a new symbol table.
  • Link it to the symbol table previous.

• Enter (table, name, type, offset)
  • Insert a new identifier name with type and offset into table.
  • Check for possible duplication.

• Addwidth (table, width)
  • Increase the size of symbol table by width.

• Enterproc (table, name, newtable)
  • Enter a procedure name into table.
  • The symbol table of name is newtable.

• Lookup (name, table)
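The operations listed above can be sketched as a small C module. The layout of `Entry` and `SymTable`, the fixed capacity and the return conventions are assumptions made for this illustration. In particular, lookup is assumed to search the given table first and then the tables linked through previous.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 32

typedef struct {
    const char *name;
    const char *type;
    int offset;
} Entry;

typedef struct SymTable {
    struct SymTable *previous;   /* enclosing table, NULL for the outermost */
    Entry entries[MAX_ENTRIES];
    int count;
    int width;
} SymTable;

/* Creates a new, empty table linked to previous. Returns NULL on failure. */
SymTable *mktable(SymTable *previous)
{
    SymTable *t = calloc(1, sizeof *t);
    if (t != NULL)
        t->previous = previous;
    return t;
}

/* Inserts name with its type and offset. Returns 0 on a duplicate or when
   the table is full, 1 on success. */
int enter(SymTable *table, const char *name, const char *type, int offset)
{
    for (int i = 0; i < table->count; i++)
        if (strcmp(table->entries[i].name, name) == 0)
            return 0;
    if (table->count == MAX_ENTRIES)
        return 0;
    table->entries[table->count++] = (Entry){name, type, offset};
    return 1;
}

/* Increases the recorded size of the table by width. */
void addwidth(SymTable *table, int width)
{
    table->width += width;
}

/* Searches table and then its enclosing tables. Returns NULL if absent. */
const Entry *lookup(const char *name, const SymTable *table)
{
    for (; table != NULL; table = table->previous)
        for (int i = 0; i < table->count; i++)
            if (strcmp(table->entries[i].name, name) == 0)
                return &table->entries[i];
    return NULL;
}
```

A typical use is to create the global table with mktable(NULL), create a table for a nested scope with mktable(global), and call lookup with the innermost table so that names from enclosing scopes are still found.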


Example:

Declaration → M1 D

M1 → ε {top (offset) := 0;}

D → D ID

D → id : T {enter (top (tblptr), id.name, T.type, top (offset)); top (offset) := top (offset) + T.width;}

T → integer {T.type := integer; T.width := 4;}
T → double {T.type := double; T.width := 8;}
T → * T1 {T.type := pointer (T1.type); T.width := 4;}

Need to remember the current offset before entering the block, and to restore it after the block is closed.

Example: Block → begin M4 Declarations statements end
{pop (tblptr); pop (offset);}

M4 → ε {t := mktable (top (tblptr)); push (t, tblptr); push (top (offset), offset);}

Can also use the block number technique to avoid creating a new symbol table.

Field names in records

• A record declaration is treated as entering a block as far as the offset is concerned.

• Need to use a new symbol table.

Example: T → record M5 D end
{T.type := (top (tblptr));
T.width := top (offset);
pop (tblptr);
pop (offset);}

M5 → ε {t := mktable (null);
push (t, tblptr);
push (0, offset);}

Assignment Statements

Expressions can be of type integer, real, array and record. As part of the translation of assignments into three-address code, we show how names can be looked up in the symbol table and how elements of arrays can be accessed.

Code generation for assignment statements:
gen ([address #1], [assignment], [address #2], operator, [address #3]);

Variable accessing: Depending on the type of [address #i], generate different codes.

Types of [address #i]:

* Local temp space

* Parameter

* Local variable

* Non-local variable

* Global variable

* Registers, constants, ...



Error handling routine: error (error information).
The error messages can be written and stored in other files.

Temp space management:

• This is used for generating code for expressions.
• newtemp (): allocates a temp space.
• freetemp (t): frees t if it is allocated in the temp space.

Label management

• This is needed in generating branching statements.
• newlabel (): generates a label in the target code that has never been used.

Names in the symbol table

S → id := E {p := lookup (id.name, top (tblptr));
             if p is not null then gen (p, ":=", E.place)
             else error ("var undefined", id.name);}

E → E1 + E2 {E.place := newtemp ();
             gen (E.place, ":=", E1.place, "+", E2.place);
             freetemp (E1.place);
             freetemp (E2.place);}

E → - E1 {E.place := newtemp ();
          gen (E.place, ":=", "uminus", E1.place);
          freetemp (E1.place);}

E → (E1) {E.place := E1.place;}

E → id {p := lookup (id.name, top (tblptr));
        if p ≠ null then E.place := p.place
        else error ("var undefined", id.name);}
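The translation scheme for names and expressions can be sketched as a small recursive translator. Everything here is illustrative: the class `Translator`, the tuple-shaped syntax tree and the rule that only compiler-generated temporaries are recycled are assumptions, not part of the scheme itself.

```python
class Translator:
    """Hypothetical translator from expression trees to three-address code."""

    def __init__(self, symbols):
        self.symbols = symbols      # name -> place, stands in for the symbol table
        self.code = []
        self.temps = set()
        self.free = []

    def newtemp(self):
        if self.free:
            return self.free.pop()
        name = f"t{len(self.temps) + 1}"
        self.temps.add(name)
        return name

    def freetemp(self, place):
        # Only temporaries are released; program variables are left alone.
        if place in self.temps and place not in self.free:
            self.free.append(place)

    def lookup(self, name):
        place = self.symbols.get(name)
        if place is None:
            raise NameError(f"var undefined: {name}")
        return place

    def expr(self, node):
        kind = node[0]
        if kind == "id":
            return self.lookup(node[1])
        if kind == "paren":
            return self.expr(node[1])
        if kind == "uminus":
            e1 = self.expr(node[1])
            place = self.newtemp()
            self.code.append(f"{place} := uminus {e1}")
            self.freetemp(e1)
            return place
        if kind == "+":
            e1 = self.expr(node[1])
            e2 = self.expr(node[2])
            place = self.newtemp()
            self.code.append(f"{place} := {e1} + {e2}")
            self.freetemp(e1)
            self.freetemp(e2)
            return place
        raise ValueError(f"unknown node kind: {kind}")

    def assign(self, name, node):
        target = self.lookup(name)
        place = self.expr(node)
        self.code.append(f"{target} := {place}")
```

For example, translating x := a + (-b) with this sketch emits the uminus first, then the addition, then the copy into x.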

Type conversions 

Assume there are only two data types: integer, float. 

For the expression,

E → E1 + E2

If E1.type = E2.type then
    generate no conversion code
    E.type = E1.type;
Else
    E.type = float;
    temp1 = newtemp ();
    If E1.type = integer then
        gen (temp1, ":=", "int-to-float", E1.place);
        gen (E, ":=", temp1, "+", E2.place);
    Else
        gen (temp1, ":=", "int-to-float", E2.place);
        gen (E, ":=", temp1, "+", E1.place);
    freetemp (temp1);
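A minimal sketch of the conversion rule for E → E1 + E2 follows. The function names `make_newtemp` and `translate_add`, and the representation of an expression as a (place, type) pair, are assumptions made for the example.

```python
import itertools


def make_newtemp():
    """Return a hypothetical newtemp() that hands out t1, t2, ..."""
    counter = itertools.count(1)
    return lambda: f"t{next(counter)}"


def translate_add(e1, e2, newtemp, code):
    """e1, e2 are (place, type) pairs with type 'integer' or 'float'."""
    (p1, ty1), (p2, ty2) = e1, e2
    result = newtemp()
    if ty1 == ty2:
        # Same types: no conversion code is generated.
        code.append(f"{result} := {p1} + {p2}")
        return result, ty1
    temp1 = newtemp()
    if ty1 == "integer":
        code.append(f"{temp1} := int-to-float {p1}")
        code.append(f"{result} := {temp1} + {p2}")
    else:
        code.append(f"{temp1} := int-to-float {p2}")
        code.append(f"{result} := {temp1} + {p1}")
    return result, "float"
```

Only the integer operand is converted, and the result type becomes float whenever the operand types differ.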

Addressing array elements

Let us assume:

• low is the lower bound of the index
• w is the element data width
• start_addr is the starting address of the array

1D array A [i]

Address for A [i] = start_addr + (i - low) * w
                  = i * w + (start_addr - low * w)

The value called base, (start_addr - low * w), can be computed at compile time and then stored in the symbol table.

Example: array [-8 .. 100] of integer
To declare a [-8], [-7], ..., [100] integer array in Pascal.

2D array A [i_1, i_2]

Row major order: row by row. A [i] means the ith row.

1st row    A [1, 1]    A [1, 2]
2nd row    A [2, 1]    A [2, 2]

A [i, j] = A [i] [j]

Column major: column by column.

A [1, 1]      A [1, 2]
A [2, 1]      A [2, 2]
1st column    2nd column

Address for A [i_1, i_2]:

start_addr + ((i_1 - low_1) * n_2 + (i_2 - low_2)) * w

where low_1 and low_2 are the lower bounds of i_1 and i_2, and n_2 is the number of values that i_2 can take. high_2 is the upper bound on the value of i_2, so n_2 = high_2 - low_2 + 1.

We can rewrite the address for A [i_1, i_2] as
((i_1 * n_2) + i_2) * w + (start_addr - ((low_1 * n_2) + low_2) * w).
The value (start_addr - low_1 * n_2 * w - low_2 * w) can be computed at compile time and then stored in the symbol table.
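The row-major address formula can be checked with a short sketch. The function names below are hypothetical; the arithmetic follows the formula for A [i_1, i_2] directly, with the constant part split off as it would be at compile time.

```python
def address_2d(start_addr, w, low1, low2, high2, i1, i2):
    """Direct form: start_addr + ((i1 - low1) * n2 + (i2 - low2)) * w."""
    n2 = high2 - low2 + 1
    return start_addr + ((i1 - low1) * n2 + (i2 - low2)) * w


def base_2d(start_addr, w, low1, low2, high2):
    """Compile-time constant: start_addr - (low1 * n2 + low2) * w."""
    n2 = high2 - low2 + 1
    return start_addr - (low1 * n2 + low2) * w


def address_2d_fast(base, w, n2, i1, i2):
    """Run-time part once the base is known: (i1 * n2 + i2) * w + base."""
    return (i1 * n2 + i2) * w + base
```

Both forms give the same address; the second one leaves only one multiplication by n2, one addition and one scaling by w for run time.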

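Multi-Dimensional Array A [i_1, i_2, ..., i_k]

Address for A [i_1, i_2, ..., i_k]

= (i_1 * n_2 * n_3 * ... * n_k + i_2 * n_3 * n_4 * ... * n_k + ... + i_k) * w
  + (start_addr - low_1 * w * n_2 * n_3 * ... * n_k
                - low_2 * w * n_3 * n_4 * ... * n_k
                - ...
                - low_k * w)

It can be computed incrementally in grammar rules:

f (1) = i_1;
f (j) = f (j - 1) * n_j + i_j;

f (k) is the value we wanted to compute.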
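The recurrence f (j) = f (j - 1) * n_j + i_j can be sketched as a loop. The names `offset_index` and `address` are hypothetical; `sizes[j]` plays the role of n_j, and the first size is never used because n_1 does not occur in the formula.

```python
def offset_index(indices, sizes):
    """Compute f(k) with f(1) = i_1 and f(j) = f(j - 1) * n_j + i_j."""
    f = indices[0]
    for i_j, n_j in zip(indices[1:], sizes[1:]):
        f = f * n_j + i_j
    return f


def address(indices, lows, sizes, w, start_addr):
    """Variable part times w plus the compile-time constant part."""
    base = start_addr - offset_index(lows, sizes) * w
    return offset_index(indices, sizes) * w + base
```

Applying the same recurrence to the lower bounds yields the constant term, which is why it can be folded into a single base value stored in the symbol table.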

Attributes needed in the translation scheme for addressing 
array elements: 

elesize: size of each element in the array
array: a pointer to the symbol table entry containing information about the array declaration
ndim: the current dimension index
base: base address of this array
place: where a variable is stored

limit (array, m) = n_m is the number of elements in the mth coordinate.

Translation scheme for array elements 

Consider the grammar

S → L := E
E → E + E | (E) | L
L → Elist ] | id
Elist → Elist, E | id [ E

• S → L := E {if L.offset = null then /* L is a simple id */
                  gen (L.place, ":=", E.place);
              else
                  gen (L.place, "[", L.offset, "]", ":=", E.place);}

• E → E1 + E2 {E.place := newtemp ();
               gen (E.place, ":=", E1.place, "+", E2.place);}

• E → (E1) {E.place := E1.place;}

• E → L {if L.offset = null then /* L is a simple id */
             E.place := L.place;
         else begin
             E.place := newtemp ();
             gen (E.place, ":=", L.place, "[", L.offset, "]");
         end}

• L → id {p := lookup (id.name, top (tblptr));
          if p ≠ null then
          begin
              L.place := p.place;
              L.offset := null;
          end
          else error ("var undefined", id.name);}

• L → Elist ] {L.offset := newtemp ();
               gen (L.offset, ":=", Elist.elesize, "*", Elist.place);
               freetemp (Elist.place);
               L.place := Elist.base;}

• Elist → Elist1, E {t := newtemp (); m := Elist1.ndim + 1;
                     gen (t, ":=", Elist1.place, "*", limit (Elist1.array, m));
                     gen (t, ":=", t, "+", E.place);
                     freetemp (E.place);
                     Elist.array := Elist1.array;
                     Elist.place := t; Elist.ndim := m;}

• Elist → id [ E {Elist.place := E.place; Elist.ndim := 1;
                  p := lookup (id.name, top (tblptr));
                  check for id errors;
                  Elist.elesize := p.size; Elist.base := p.base;
                  Elist.array := p.place;}


Boolean Expressions

There are two choices for implementing boolean expressions:

1. Numerical representation

2. Flow of control

Numerical representation

• Encode true and false values numerically, for example 1 for true and 0 for false.

Flow of control: Representing the value of a boolean expression by a position reached in a program.

Short circuit code: Generate the code to evaluate a boolean expression in such a way that it is not necessary to evaluate the entire expression.

• In a_1 or a_2, if a_1 is true then a_2 is not evaluated.

• In a_1 and a_2, if a_1 is false then a_2 is not evaluated.

Numerical representation

B → id1 relop id2
    {B.place := newtemp ();
     gen ("if", id1.place, relop.op, id2.place, "goto", nextstat + 3);
     gen (B.place, ":=", "0");
     gen ("goto", nextstat + 2);
     gen (B.place, ":=", "1");}

Example 1: Translate the statement (if a < b or c < d and e < f) without short circuit evaluation.

100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1 /* true */
104: if c < d goto 107
105: t2 := 0 /* false */
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4
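The numerical representation of a relational expression always expands to the same four statements, which can be sketched as below. The helper `relop_numeric` and the representation of code as (address, text) pairs are assumptions for the illustration; `nextstat` is the address of the next statement to be emitted.

```python
def relop_numeric(code, nextstat, place, left, op, right):
    """Emit the four-statement pattern for 'place := left op right'.

    Returns the new value of nextstat after the four statements.
    """
    code.append((nextstat, f"if {left} {op} {right} goto {nextstat + 3}"))
    code.append((nextstat + 1, f"{place} := 0"))
    # The goto sits at nextstat + 2 and skips the ':= 1' that follows it.
    code.append((nextstat + 2, f"goto {nextstat + 4}"))
    code.append((nextstat + 3, f"{place} := 1"))
    return nextstat + 4


code = []
n = 100
n = relop_numeric(code, n, "t1", "a", "<", "b")
n = relop_numeric(code, n, "t2", "c", "<", "d")
n = relop_numeric(code, n, "t3", "e", "<", "f")
code.append((n, "t4 := t2 and t3"))
code.append((n + 1, "t5 := t1 or t4"))
```

Starting at address 100, the three comparisons occupy 100 to 111 and the two boolean operations follow at 112 and 113.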

■i ^ 


Flow of Control Statement

B → id1 relop id2
    {B.true := newlabel ();
     B.false := newlabel ();
     B.code := gen ("if", id1, relop, id2, "goto", B.true, "else", "goto", B.false)
               || gen (B.true, ":")}

S → if B then S1 {S.code := B.code || S1.code || gen (B.false, ":")}

|| is the code concatenation operator.

1. If-then implementation:

S → if B then S1 {gen (B.false, ":");}

            B.code       → to B.true
                         → to B.false
B.true:     S1.code
B.false:


2. If-then-else

P → S {S.next := newlabel ();
       P.code := S.code || gen (S.next, ":")}

S → if B then S1 else S2 {S1.next := S.next;
                          S2.next := S.next;
                          S.code := B.code || S1.code ||
                                    gen ("goto", S.next) || gen (B.false, ":")
                                    || S2.code}

Need to use inherited attributes of S to define the attributes of S1 and S2.

            B.code       → to B.true
                         → to B.false
B.true:     S1.code
            goto S.next
B.false:    S2.code
S.next:


3. While loop:

B → id1 relop id2 {B.true := newlabel ();
                   B.false := newlabel ();
                   B.code := gen ("if", id1, relop, id2, "goto", B.true, "else", "goto", B.false)
                             || gen (B.true, ":")}

S → while B do S1 od {S.begin := newlabel ();
                      S.code := gen (S.begin, ":") || B.code || S1.code ||
                                gen ("goto", S.begin) || gen (B.false, ":")}

S.begin:    B.code       → to B.true
                         → to B.false
B.true:     S1.code
            goto S.begin
B.false:

4. Switch/case statement

The C-like syntax:

switch expr {
    case V[1]: S[1]
    ...
    case V[k]: S[k]
    default: S[d]
}

Translation sequence

• Evaluate the expression.

• Find which value in the list matches the value of the expression; match default only if there is no match.

• Execute the statement associated with the matched value.

How to find the matched value? The matched value can be found in the following ways:

1. Sequential test

2. Lookup table

3. Hash table

4. Back patching

Two different translation schemes for sequential test are shown below:

1. code to evaluate E into t
   goto test

   L[1]: code for S[1]
         goto next
   ...
   L[k]: code for S[k]
         goto next
   L[d]: code for S[d]
         goto next

   test: if t = V[1] goto L[1]
         ...
         if t = V[k] goto L[k]
         goto L[d]
   next:

   This scheme can easily be converted into a lookup table.

2. code to evaluate E into t
         if t <> V[1] goto L[1]
         code for S[1]
         goto next
   L[1]: if t <> V[2] goto L[2]
         code for S[2]
         goto next
   ...
   L[k-1]: if t <> V[k] goto L[k]
         code for S[k]
         goto next
   L[k]: code for S[d]
   next:
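The first sequential-test scheme for switch/case, with all tests collected at the end, can be sketched as a small code emitter. The function `translate_switch`, its arguments and the label spellings `L1`, `Ld`, `test` and `next` are illustrative assumptions.

```python
def translate_switch(eval_code, cases, default_body):
    """cases is a list of (value, body_lines); returns the emitted lines."""
    lines = list(eval_code) + ["goto test"]
    for k, (_, body) in enumerate(cases, start=1):
        lines.append(f"L{k}:")
        lines += body
        lines.append("goto next")
    lines.append("Ld:")
    lines += default_body
    lines.append("goto next")
    # All comparisons are grouped after the statement bodies.
    lines.append("test:")
    for k, (value, _) in enumerate(cases, start=1):
        lines.append(f"if t = {value} goto L{k}")
    lines.append("goto Ld")
    lines.append("next:")
    return lines


out = translate_switch(
    ["t := x"],
    [(1, ["a := 1"]), (2, ["a := 2"])],
    ["a := 0"],
)
```

Because the comparisons form one contiguous run, that run is the part a compiler could replace by a table-driven jump.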



2. Lookup table: Use a table and a loop to find the address to jump to.

3. Hash table: When there are more than two entries, use a hash table to find the correct table entry.

4. Back patching:

Generate a series of branching statements with the targets of jumps temporarily left unspecified.

To-be-determined label table: each entry contains a list of places that need to be back patched.

Can also be used to implement labels and gotos.
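Back patching can be sketched with a table that maps each undefined label to the places waiting for it. The class `Backpatcher` and its method names are hypothetical, and addresses are simply indices into the emitted code list.

```python
class Backpatcher:
    """Hypothetical emitter that fills in jump targets once labels are known."""

    def __init__(self):
        self.code = []
        self.defined = {}   # label -> address
        self.pending = {}   # label -> list of (index, prefix) to be patched

    def emit(self, text):
        self.code.append(text)

    def emit_jump(self, label, condition=None):
        prefix = f"if {condition} goto " if condition else "goto "
        if label in self.defined:
            self.code.append(prefix + str(self.defined[label]))
        else:
            # Target unknown: remember the place and leave a hole.
            self.pending.setdefault(label, []).append((len(self.code), prefix))
            self.code.append(prefix + "?")

    def define(self, label):
        address = len(self.code)
        self.defined[label] = address
        for index, prefix in self.pending.pop(label, []):
            self.code[index] = prefix + str(address)


bp = Backpatcher()
bp.emit_jump("L1", "a < b")
bp.emit("t := 0")
bp.emit_jump("L2")
bp.define("L1")
bp.emit("t := 1")
bp.define("L2")
```

The same table handles user-written labels and gotos: a forward goto leaves a hole, and the later label definition patches it.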

Procedure Calls 

* Space must be allocated for the activation record of the called procedure.

* Arguments are evaluated and made available to the called procedure in a known place.

* Save current machine status.

* When a procedure returns:

  • Place the return value in a known place.

  • Restore the activation record.

Example: S → call id (Elist)
         {for each item p on the queue Elist.queue do
              gen ('PARAM', p);
          gen ('call', id.place);}

Elist → Elist, E {append E.place to the end of Elist.queue}

Elist → E {initialize Elist.queue to contain only E.place}

Use a queue to hold parameters, then generate codes for params.

Code for E1, store in t1
...
Code for Ek, store in tk
PARAM t1
...
PARAM tk
call P
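The queue-based scheme for procedure calls can be sketched as three small functions, one per production. The function names are hypothetical; each argument is represented only by the place that holds its value.

```python
from collections import deque


def elist_single(e_place):
    """Elist -> E: the queue contains only E.place."""
    return deque([e_place])


def elist_append(queue, e_place):
    """Elist -> Elist, E: append E.place to the end of the queue."""
    queue.append(e_place)
    return queue


def gen_call(proc_place, queue):
    """S -> call id (Elist): one PARAM per queued place, then the call."""
    code = [f"PARAM {place}" for place in queue]
    code.append(f"call {proc_place}")
    return code


q = elist_append(elist_single("t1"), "t2")
call_code = gen_call("P", q)
```

Queueing the places keeps all PARAM statements together after the code that evaluates the arguments.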

Terminology:

Procedure declaration:
Parameters, formal parameters
Procedure call:
Arguments, actual parameters.

The values of a variable: x = y

• r-value: the value of the variable, i.e., on the right side of the assignment. Ex: y in the above assignment.

• l-value: the address of the variable, i.e., on the left side of the assignment. Ex: x in the above assignment.

Parameter passing modes:

1. call-by-value

2. call-by-reference

3. call-by-value-result (copy-restore)

4. call-by-name


Call by value

The calling procedure copies the r-values of the arguments into the called procedure's activation record.

Changing a formal parameter has no effect on the actual parameter.

Example:

#include <stdio.h>

void add (int c)
{
    c = c + 10;
    printf ("\nc = %d", c);
}

int main (void)
{
    int a = 5;
    printf ("a = %d", a);
    add (a);
    printf ("\na = %d", a);
    return 0;
}

In main, a will not be affected by calling add (a).

It prints a = 5
c = 15
a = 5

Only the value of c in add () will be changed to 15.

Usage:

1. Used by PASCAL and C++ if we use non-var parameters.

2. The only thing used in C.

Advantages:

1. No aliasing.

2. Easier for static optimization analysis.

3. Faster execution because there is no need for redirection.


Call by reference

The calling procedure copies the l-values of the arguments into the called procedure's activation record, i.e., addresses will be passed to the called procedure.

• Changing a formal parameter affects the corresponding actual parameter.

• It will have some side effects.

Example:

#include <stdio.h>

void add (int *c)
{
    *c = *c + 10;
    printf ("\nc = %d", *c);
}

int main (void)
{
    int a = 5;
    printf ("a = %d", a);
    add (&a);
    printf ("\na = %d", a);
    return 0;
}

Output: a = 5
c = 15
a = 15

Here the actual parameter is also modified.

Advantages:

1. Efficiency in passing large objects.

2. Only need to copy addresses.

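Call by value-result

• Equivalent to call-by-reference except when there is aliasing.

• Equivalent in the sense that the program produces the same result, but not that the same code will be generated.

• Aliasing arises, for example, in call-by-reference with the same expression passed as an argument twice.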

Example: test (W) 

''TTnhcre is . —• -£££“* " ^ 
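The difference between call-by-reference and call-by-value-result under aliasing can be simulated with a small sketch. Python has neither mode built in, so the class `Cell` stands in for a memory location; the procedure `body` and the left-to-right copy-out order are assumptions chosen for the illustration.

```python
class Cell:
    """Hypothetical memory location holding one value."""

    def __init__(self, value):
        self.value = value


def body(x, y):
    """Procedure body: x := x + 1; y := y + 10."""
    x.value += 1
    y.value += 10


def call_by_reference(a, b):
    # The formals are the actual locations themselves.
    body(a, b)


def call_by_value_result(a, b):
    x, y = Cell(a.value), Cell(b.value)   # copy in
    body(x, y)
    a.value = x.value                     # copy out, left to right
    b.value = y.value


ref = Cell(5)
call_by_reference(ref, ref)       # same variable passed twice: aliasing

res = Cell(5)
call_by_value_result(res, res)    # same call under copy-restore
```

With the aliased call the by-reference version applies both updates to one location, while copy-restore works on two private copies and the last copy-out wins. Without aliasing both modes leave the same values.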

Call by name

• Each occurrence of a parameter in the called procedure is replaced with the corresponding argument.

• A parameter is not evaluated unless its value is needed during computation.

Example:

void show (int x)
{
    for (int y = 0; y < 10; y++)
        x++;
}

main ()
{
    int j;
    j = -1;
    show (j);
}

After replacing the parameter with the argument, the call behaves like:

    int j;
    j = -1;
    for (int y = 0; y < 10; y++)
        j++;

• Instead of passing values or addresses as arguments, a function is passed for each argument.

• These functions are called thunks.

• Each time a parameter is used, the thunk is called, and the address returned by the thunk is used.

  y = 0: use the return value of the thunk for y as the l-value.

Advantages

• More efficient when passing parameters that are never used.

• This saves a lot of time when evaluating the unused parameter would take a long time.
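Thunks can be imitated with closures. In this sketch a thunk for a variable is a hypothetical (get, put) pair standing in for "the address of the argument", and a thunk for an expression is a plain function that is only called when the parameter is used.

```python
def make_thunk(env, name):
    """Return closures that read and write env[name] on demand."""

    def get():
        return env[name]

    def put(value):
        env[name] = value

    return get, put


def show(x):
    """Body of show: every use of x goes through the thunk."""
    get_x, put_x = x
    for _ in range(10):
        put_x(get_x() + 1)


evaluations = []


def costly_argument():
    """An argument expression that records when it is evaluated."""
    evaluations.append("evaluated")
    return 99


def ignores_parameter(arg_thunk):
    # The parameter is never used, so the thunk is never called.
    return 0


env = {"j": -1}
show(make_thunk(env, "j"))
unused_result = ignores_parameter(costly_argument)
```

After the call, j has been incremented ten times through the thunk, and the costly argument was never evaluated.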


Code Generation

Code generation is the final phase of the compiler model.

[Figure: Source program (input) → Front end → Intermediate code → Code optimization → Intermediate code → Code generation → Target program]

The requirements imposed on a code generator are

1. Output code must be correct.

2. Output code must be of high quality.

3. Code generator should run efficiently.

Issues in the Design of a Code Generator

The generic issues in the design of code generators are

• Input to the code generator

• Target programs

• Memory management

• Instruction selection

• Register allocation

• Choice of evaluation order

Input to the code generator 

Intermediate representation with the symbol table will be the input for the code generator.

• High-level intermediate representation

  Example: Abstract Syntax Tree (AST)

• Medium-level intermediate representation

  Example: control flow graph of complex operations

• Low-level intermediate representation

  Example: Quadruples, DAGs

• Code for abstract stack machine, i.e., postfix code.


Example: „, + 2 | nUms 


Mov,,„Hi tt 

0 l() ad „ S >31 


Target programs

The output of the code generator is the target program. The output may take on a variety of forms:

1. Absolute machine language

2. Relocatable machine language

3. Assembly language

Absolute machine language

• Final memory area for a program is statically known.

• Hard coded addresses.

• Sufficient for very simple systems.

Advantages:

• Fast for small programs

• No separate compilation

Disadvantages: Cannot call modules from other languages/compilers.

Relocatable code It needs

• Relocation table

• Relocating linker + loader (or) runtime relocation in the Memory Management Unit (MMU).

Advantage: More flexible.

Assembly language Generates assembly code and uses an assembler tool to convert this to binary (object) code. It needs (i) assembler (ii) linker and loader.

Advantage: Easier to handle and closer to machine.

Memory management

Mapping names in the source program to addresses of data objects in runtime memory is done by the front end and the code generator.

• A name in a three address statement refers to a symbol table entry for the name.

• Stack, heap, garbage collection is done here.

Instruction selection

Instruction selection depends on the factors like

• Uniformity

• Completeness of the instruction set

• Instruction speed

• Machine idioms

Choose a set of instructions equivalent to the intermediate representation code.

Minimize execution time, used registers and code size.

Register allocation

• Instructions with register operands are usually shorter and faster, so keep frequently used values in registers.

• Some registers are reserved.

  Example: SP, PC ... etc.

• Minimize the number of loads and stores.

Evaluation order

• Some orders require fewer registers to hold intermediate results.

Target Machine

Let us assume the target computer is

• Byte addressable with 4 bytes per word

• It has n general purpose registers
  R0, R1, R2, ... Rn-1

• It has 2-address instructions of the form
  OP source, destination

  [cost: 1 + added]

Example: The op may be MOV, ADD, MUL.
Generally cost will be like this

Source      Destination    Cost
Register    Register       1
Register    Memory         2
Memory      Register       2
Memory      Memory         3

Addressing modes:

Mode                 Form     Address
Absolute             M        M
Register             R        R
Indexed              C(R)     C + contents (R)
Indirect register    *R       contents (R)
Indirect indexed     *C(R)    contents (C + contents (R))

Example: x := y - z

MOV y, R0 → cost = 2
SUB z, R0 → cost = 2
MOV R0, x → cost = 2

Total cost = 6
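The instruction cost rule, 1 plus the added cost of the operands, can be sketched for the simple case where every operand is either a register or a memory name. The helper names are hypothetical, and the other addressing modes are deliberately left out of this sketch.

```python
import re

REGISTER = re.compile(r"^R\d+$")


def operand_cost(operand):
    """Assumption: a register adds 0 and an absolute memory operand adds 1."""
    return 0 if REGISTER.match(operand) else 1


def instruction_cost(instr):
    """Cost of 'OP source, destination' as 1 + added operand costs."""
    _, rest = instr.split(None, 1)
    operands = [part.strip() for part in rest.split(",")]
    return 1 + sum(operand_cost(o) for o in operands)


program = ["MOV y, R0", "SUB z, R0", "MOV R0, x"]
total = sum(instruction_cost(i) for i in program)
```

For the three-instruction sequence computing x := y - z, each instruction touches memory once, which gives a total cost of 6.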
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Storage Management 


Storage organization

To run a compiled program, the compiler will demand a block of memory from the operating system. This block of memory is called runtime storage.

This runtime storage is subdivided into the generated target code, data objects and information which keeps track of procedure activations.

The fixed data (generated code) is stored in the statically determined area of the memory. The target code is placed at the lower end of the memory.

The data objects are stored in the statically determined area as their size is known at compile time. The compiler stores these data objects in the statically determined area because they are compiled into the target code. This static data area is placed on top of the code area.

The runtime storage contains the stack and the heap. The stack contains activation records and the program counter; data objects within an activation record are also stored in this stack with relevant information.

The heap area allocates the memory for the dynamic data (for example, some data items are allocated under program control).

The size of the stack and heap will grow or shrink according to the program execution.


Activation Record 

Information needed during an execution of a procedure is kept in a block of storage called an activation record.

• Storage for names local to the procedure appears in the activation record.

• Each execution of a procedure is referred to as an activation of the procedure.

• If the procedure is recursive, several of its activations might be alive at a given time.

• Runtime storage is subdivided into

  1. Generated target code area

  2. Data objects area

  3. Stack

  4. Heap

• Sizes of stack and heap can change during program execution.

For code generation there are two standard storage allocations:

1. Static allocation: The position of an activation record in memory is fixed at compile time.

2. Stack allocation: A new activation record is pushed on to the stack for each execution of the procedure.

The record is popped when the activation ends.

Control stack The control stack is used for managing active procedures, which means when a call occurs, the execution of the activation is interrupted and status information is saved on the stack.

When control is returned from a call, the suspended activation is resumed after restoring the values of the relevant registers. This also includes the program counter, which is set to the point immediately after the call.

The size of the stack is not fixed.

Scope of declarations Declaration scope refers to the portion of the program text in which the rules defined by the declaration apply. Within the defined scope, the declared entity can be legally accessed.

The scope of a declaration always contains its immediate scope. The immediate scope is the region of the declarative portion that immediately encloses the declaration.

The scope starts at the beginning of the declaration and continues till the end of the declaration. Whereas in an overloadable declaration, the immediate scope begins when the callable entity profile is determined.

The visible part refers to the text portion of the declaration which is visible from outside.

Flow Graph 

A flow graph is a graph representation of three address 
statement sequences. 

• Useful for code generation algorithms.

• Nodes in the flow graph represent computations.

• Edges represent flow of control.

Basic Blocks 

Basic blocks are sequences of consecutive statements in 
which flow of control enters at the beginning and leaves at 
the end without a halt or branching. 

1. First determine the set of leaders 

• First statement is leader 

• Any target of goto is a leader 

• Any statement that follows a goto is a leader. 

2. For each leader its basic block consists of the leader 
and all statements up to next leader. 

Initial node: the block whose leader is the first statement.
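The leader rules can be turned into a short partitioning routine. This sketch assumes that statements are strings and that jump targets are written as `goto (n)` with 1-based statement numbers; the function name `basic_blocks` is hypothetical.

```python
import re


def basic_blocks(stmts):
    """Split a list of three-address statements into basic blocks."""
    leaders = {0}                                  # first statement is a leader
    for i, stmt in enumerate(stmts):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)) - 1)   # target of a goto
            if i + 1 < len(stmts):
                leaders.add(i + 1)                 # statement following a goto
    order = sorted(leaders)
    ends = order[1:] + [len(stmts)]
    # Each block runs from its leader up to, but not including, the next leader.
    return [stmts[start:end] for start, end in zip(order, ends)]


blocks = basic_blocks([
    "prod := 0",
    "i := 1",
    "t1 := 4 * i",
    "prod := prod + t1",
    "i := i + 1",
    "if i <= 20 goto (3)",
])
```

Here statement 1 and the jump target, statement 3, are the leaders, so the code splits into an initialization block and a loop block.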

Example: consider the following fragment of code that computes the dot product of two vectors x and y of length 10

begin

prod := 0;
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I — ' „ 

A compare instruction sets the condition code without actually computing the value.

Example: CMP x, y

A condition code descriptor records the name that last set the condition code.

Example:

x := y + z
if x < 0 goto z

can be implemented by

MOV y, R0
ADD z, R0
MOV R0, x
CJ< z

Code Generation from DAG:

S1 = 4 * i
S2 = addr(A) - 4
S3 = S2[S1]
S4 = 4 * i
S5 = addr(B) - 4
S6 = S5[S4]
S7 = S3 * S6
S8 = prod + S7
prod = S8
S9 = i + 1
i = S9
if i <= 20 goto (1)

After removing the common subexpression and the copies:

S1 = 4 * i
S2 = addr(A) - 4
S3 = S2[S1]
S5 = addr(B) - 4
S6 = S5[S1]
S7 = S3 * S6
prod = prod + S7
i = i + 1
if i <= 20 goto (1)


Rearranging order of the code

Consider the following basic block
t1 := a + b
t2 := c + d
t3 := e - t2
x := t1 - t3

DAG Representation 
of Basic Blocks 

• DAGs are useful data structures for implementing transformations on basic blocks.

• A DAG tells how the value computed by a statement is used in subsequent statements.

• It is a good way of determining common subexpressions.

• A DAG for a basic block has the following labels on the nodes:

  • Leaves are labeled by unique identifiers, either variable names or constants.

  • Interior nodes are labeled by an operator symbol.

  • Nodes are also optionally given a sequence of identifiers for labels.

Example:
1: t1 := 4 * i
2: t2 := a [t1]
3: t3 := 4 * i
4: t4 := b [t3]
5: t5 := t2 * t4
6: t6 := prod + t5
7: prod := t6
8: t7 := i + 1
9: i := t7
10: if i <= 20 goto (1)
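A DAG for a basic block can be built by reusing a node whenever the same operator is applied to the same operand nodes. The sketch below is one hypothetical way to do it: statements are (dest, op, arg1, arg2) tuples, "[]" stands for array indexing, and "copy" stands for a plain assignment.

```python
def build_dag(block):
    """Return (nodes, labels): labels maps each identifier to its node id."""
    nodes = []       # node id -> (op, left id, right id) or ("leaf", name, None)
    index = {}       # structural key -> node id, used to share equal nodes
    current = {}     # identifier -> node currently holding its value

    def intern(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def node_for(name):
        if name not in current:
            current[name] = intern(("leaf", name, None))
        return current[name]

    for dest, op, arg1, arg2 in block:
        if op == "copy":
            current[dest] = node_for(arg1)
        else:
            left = node_for(arg1)
            right = node_for(arg2)
            current[dest] = intern((op, left, right))
    return nodes, current


block = [
    ("t1", "*", "4", "i"),
    ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"),
    ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"),
    ("t6", "+", "prod", "t5"),
    ("prod", "copy", "t6", None),
    ("t7", "+", "i", "1"),
    ("i", "copy", "t7", None),
]
nodes, labels = build_dag(block)
```

Statements 1 and 3 end up attached to the same node, which is exactly the common subexpression 4 * i.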




Target code for the basic block:
(Assuming only two registers are available)

MOV a, R0
ADD b, R0
MOV c, R1
ADD d, R1
MOV R0, t1      Register spilling
MOV e, R0
SUB R1, R0
MOV t1, R1      Register reloading
SUB R0, R1
MOV R1, x

Rearranging the code as

t2 := c + d
t3 := e - t2
t1 := a + b
x := t1 - t3

The rearrangement gives the code:

MOV c, R0
ADD d, R0
MOV e, R1
SUB R0, R1
MOV a, R0
ADD b, R0
SUB R1, R0
MOV R0, x
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.P- 

The types of errors are:

• Lexical errors
• Syntactic errors
• Semantic errors
• Run-time errors

Lexical errors If the variables (or) constants are declared or defined not according to the rules of the language, or special symbols are included which are not part of the language, it is a lexical error.

The lexical analyzer is constructed based on pattern recognizing rules to form a token. When the source code is made into tokens and these tokens are not according to the rules, errors are generated.

Consider a C program statement
Printf ("Hello World");

Here Printf, (, "Hello World", ), ; are the tokens.

Printf is not a recognizable pattern; actually it should be printf. It generates an error.

Syntactic errors These errors include missing semicolons, missing braces etc., which are not according to the language rules.

The parser reports the errors.

Semantic errors This type of error arises when an operation is performed over incompatible types of variables, on double declaration, on assigning values to undefined variables etc.

Runtime errors The runtime errors are the ones which are detected at runtime. These include pointers with NULL values, accessing a variable which is out of its boundary, illegal arithmetic operations etc.

After the detection of errors, the following recovery strategies should be implemented.

1. Panic mode recovery

2. Phrase level recovery

3. Error production

4. Global correction.

Peephole Optimization

• Target code often contains redundant instructions and suboptimal constructs.

• Improving the performance of the target program by examining a short sequence of target instructions (peephole) and replacing these instructions by a shorter or faster sequence is peephole optimization.

• The peephole is a small, moving window on the target program. Some well known peephole optimizations are

  • Eliminating redundant instructions (redundant loads and stores)
  • Eliminating unreachable code
  • Flow of control optimizations or eliminating jumps over jumps
  • Algebraic simplifications
  • Strength reduction
  • Use of machine idioms

Elimination of Redundant Loads and Stores

Example 1: (1) MOV R0, a
           (2) MOV a, R0

We can delete instruction (2), because the value of a is already in R0.

Example 2: Load x, R0
           Store R0, x

If there are no modifications to R0 or x, then the store instruction can be deleted.

Example 3: (1) Load x, R0
           (2) Store R0, x
Example 4: (1) Store R0, x
           (2) Load x, R0

The second instruction can be deleted from both examples 3 and 4.

Example 5: Store R0, x
           Load x, R0

Here the load instruction can be deleted.
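Eliminating a redundant load or store can be sketched as a pass over adjacent instruction pairs. The tuple layout (label, op, source, destination), the set `MOVES` and the function names are assumptions; an instruction that carries a label is kept, since control may reach it from elsewhere.

```python
MOVES = {"MOV", "Load", "Store"}


def is_redundant(prev, cur):
    """True if cur only moves back the value that prev has just moved."""
    _, p_op, p_src, p_dst = prev
    label, c_op, c_src, c_dst = cur
    return (
        label is None
        and p_op in MOVES
        and c_op in MOVES
        and c_src == p_dst
        and c_dst == p_src
    )


def peephole(code):
    """Drop each move that merely undoes the instruction kept before it."""
    out = []
    for instr in code:
        if out and is_redundant(out[-1], instr):
            continue
        out.append(instr)
    return out


optimized = peephole([
    (None, "MOV", "R0", "a"),
    (None, "MOV", "a", "R0"),
    (None, "ADD", "b", "R0"),
])
labeled = peephole([
    (None, "Store", "R0", "x"),
    ("L1", "Load", "x", "R0"),
])
```

In the first sequence the second MOV disappears; in the second sequence the load is kept because of its label.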
Eliminating Unreachable code

An unlabeled instruction immediately following an unconditional jump may be removed.

• May be produced due to debugging code introduced during development.

• May be due to updates in programs without considering the whole program segment.

Example: Let print = 0

    if print = 1 goto L1
    goto L2
L1: print instructions
L2:

becomes

    if print != 1 goto L2
    print instructions
L2:

Since print = 0, this is

    if 0 != 1 goto L2
    print instructions
L2:

which becomes

    goto L2
    print instructions
L2:

In all of the above cases the print instructions can be eliminated.

Example: goto L, 

Flow of control optimizations The unnecessary jumps can be eliminated.

Jumps like:
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